Abstract

The relationship between Popper spaces (conditional probability spaces that satisfy
some regularity conditions), lexicographic probability systems (LPS's) [Blume,
Brandenburger, and Dekel 1991a; Blume, Brandenburger, and Dekel 1991b], and
nonstandard probability spaces (NPS's) is considered. If countable additivity is
assumed, Popper spaces and a subclass of LPS's are equivalent; without the
assumption of countable additivity, the equivalence no longer holds. If the state space
is finite, LPS's are equivalent to NPS's. However, if the state space is infinite, NPS's
are shown to be more general than LPS's.

1 Introduction

Probability is certainly the most commonly-used approach for representing uncertainty
and conditioning the standard way of updating probabilities in the light of new
information. Unfortunately, there is a well-known problem with conditioning: Conditioning on
events of measure 0 is not defined. That makes it unclear how to proceed if an agent
learns something to which she initially assigned probability 0. Although consideration of
events of measure 0 may seem to be of little practical interest, it turns out to play a critical
role in game theory, particularly in the analysis of strategic reasoning in extensive-form 
games and in the analysis of weak dominance in normal-form games (see, for example, 
[Battigalli 1996; Battigalli and Siniscalchi 2002; Blume, Brandenburger, and Dekel 1991a; 
Blume, Brandenburger, and Dekel 1991b; Brandenburger, Friedenberg, and Keisler 2008; 
Fudenberg and Tirole 1991; Hammond 1994; Hammond 1999; Kohlberg and Reny 1997; 
Kreps and Wilson 1982; Myerson 1986; Selten 1965; Selten 1975]). It also arises in the 
analysis of conditional statements by philosophers (see [Adams 1966; McGee 1994]), and 
in dealing with nonmonotonicity in Artificial Intelligence (see, for example, [Lehmann 
and Magidor 1992]). 

There have been various attempts to deal with the problem of conditioning on events 
of measure 0. Perhaps the best known involves conditional probability spaces (CPS's). 
The idea, which goes back to Popper [1934, 1968] and de Finetti [1936], is to take as 
primitive not probability, but conditional probability. If $\mu$ is a conditional probability
measure on a space $W$, then $\mu(V \mid U)$ may still be undefined for some pairs $V$ and $U$, but
it is also possible that $\mu(V \mid U)$ is defined even if $\mu(U \mid W) = 0$. A second approach, which
goes back to at least Robinson [1973] and has been explored in the economics literature 
[Hammond 1994; Hammond 1999], the AI literature [Lehmann and Magidor 1992; Wilson 
1995], and the philosophy literature (see [McGee 1994] and the references therein) is to 
consider nonstandard probability spaces (NPS's), where there are infinitesimals that can 
be used to model events that, intuitively, have infinitesimally small probability yet may 
still be learned or observed. 

There is a third approach to this problem, which uses sequences of probability
measures to represent uncertainty. The most recent exemplar of this approach, which I focus
on here, are the lexicographic probability systems (LPS's) of Blume, Brandenburger, and
Dekel [1991a, 1991b] (BBD from now on). However, the idea of using a system of
measures to represent uncertainty actually was explored as far back as the 1950s by Renyi
[1956] (see Section 3.4). A lexicographic probability system is a sequence $(\mu_0, \mu_1, \ldots)$ of
probability measures. Intuitively, the first measure in the sequence, $\mu_0$, is the most
important one, followed by $\mu_1$, $\mu_2$, and so on. One way to understand LPS's is in terms
of NPS's. Roughly speaking, the probability assigned to an event $U$ by a sequence such
as $(\mu_0, \mu_1)$ can be taken to be $\mu_0(U) + \epsilon\mu_1(U)$, where $\epsilon$ is an infinitesimal. Thus, even if
the probability of $U$ according to $\mu_0$ is 0, $U$ still has a positive (although infinitesimal)
probability if $\mu_1(U) > 0$.

What is the precise relationship between these approaches? The relationship between
LPS's and CPS's has been considered before. For example, Hammond [1994] shows that
conditional probability spaces are equivalent to a subclass of LPS's called lexicographic
conditional probability spaces (LCPS's) if the state space is finite and it is possible to
condition on any nonempty set.$^1$ As shown by Spohn [1986], Hammond's result can
be extended to arbitrary countably additive Popper spaces, where a Popper space is a
conditional probability space where the events on which conditioning is allowed satisfy
certain regularity conditions. As I show, this result depends critically on a number
of assumptions. In particular, it does not work without the assumption of countable
additivity, it requires that we extend LCPS's appropriately to the infinite case, and it is
sensitive to the choice of conditioning events. For example, if we consider CPS's where
the conditioning events can be viewed as information sets, and so are not closed under
supersets (this is essentially the case considered by Battigalli and Siniscalchi [2002]),
then the result no longer holds.

1 Despite this isomorphism, it is not clear that conditional probability spaces are equivalent to LPS's.

Turning to the relationship between LPS's and NPS's, I show that if the state space
is finite, then LPS's are in a sense equivalent to NPS's. More precisely, say that two
measures of uncertainty $\nu_1$ and $\nu_2$ (each of which can be either an LPS or an NPS) are
equivalent, denoted $\nu_1 \approx \nu_2$, if they cannot be distinguished by (real-valued) random
variables; that is, for all random variables $X$ and $Y$, $E_{\nu_1}(X) < E_{\nu_1}(Y)$ iff
$E_{\nu_2}(X) < E_{\nu_2}(Y)$ (where $E_{\nu}(X)$ denotes the expected value of $X$ with respect to $\nu$).
To the extent that we are interested in these representations of uncertainty for decision making,
we should not try to distinguish two representations that are equivalent. I show that, in
finite spaces, there is a straightforward bijection between $\approx$-equivalence classes of LPS's
and NPS's. This equivalence breaks down if the state space is infinite; in this case, NPS's
are strictly more general than LPS's (whether or not countable additivity is assumed).

Finally, I consider the relationship between Popper spaces and NPS's, and show that 
NPS's are more general. (The theorem I prove is a generalization of one proved by McGee 
[1994], but my interpretation of it is quite different; see Section 5.) 

These results give some useful insight into independence of random variables. There 
have been a number of alternative notions of independence considered in the literature of 
extended probability spaces (i.e., approaches that deal with the problem of conditioning 
on sets of measure 0): BBD considered three; Kohlberg and Reny [1997] considered two 
others. It turns out that these notions are perhaps best understood in the context of 
NPS's; I describe and compare them here. 

Many of the new results in this paper involve infinite spaces. Given that most games
studied by game theorists are finite, it is fair to ask whether these results have any
significance for game theory. I believe they do. Even if the underlying game is finite, the
set of types is infinite. Epistemic characterizations of solution concepts often make use of
complete type spaces, which include every possible type of every player, where a type
determines an (extended) probability over the strategies and types of the other players; this
must be an infinite space. For example, Battigalli and Siniscalchi [2002] use a complete
type space where the uncertainty is represented by cps's to give an epistemic
characterization of extensive-form rationalizability and backward induction, while Brandenburger,
Friedenberg, and Keisler [2008] use a complete type space where the uncertainty is
represented by LPS's to get a characterization of weak dominance in normal-form games. As
the results of this paper show, the set of types depends to some extent on the notion of
extended probability used. Similarly, a number of characterizations of solution concepts
depend on independence (see, for example, [Battigalli 1996; Kohlberg and Reny 1997;
Battigalli and Siniscalchi 1999]). Again, the results of this paper show that these notions
can be somewhat sensitive to exactly how uncertainty is represented, even with a finite
state space. While I do not present any new game-theoretic results here, I believe that
the characterizations I have provided may be useful both in terms of defending particular
choices of representation used and suggesting new solution concepts.

It depends on exactly what we mean by equivalence. The same comment applies below where the word
"equivalent" is used. See Section 7 for further discussion. I thank Geir Asheim for bringing this point
to my attention.
The remainder of the paper is organized as follows. In the next section, I review 
all the relevant definitions for the three representations of uncertainty considered here. 
Section 3 considers the relationship between Popper spaces and LPS's. Section 4 considers 
the relationship between LPS's and NPS's. Finally, Section 5 considers the relationship 
between Popper spaces and NPS's. In Section 6 I consider what these results have to say 
about independence. I conclude with some discussion in Section 7. 

2 Conditional, lexicographic, and nonstandard probability spaces

In this section I briefly review the three approaches to representing likelihood discussed 
in the introduction. 

2.1 Popper spaces 

A conditional probability measure takes pairs $U, V$ of subsets as arguments; $\mu(V, U)$ is
generally written $\mu(V \mid U)$ to stress the conditioning aspects. The first argument comes
from some algebra $\mathcal{F}$ of subsets of a space $W$; if $W$ is infinite, $\mathcal{F}$ is often taken to be a
$\sigma$-algebra. (Recall that an algebra of subsets of $W$ is a set of subsets containing $W$ and
closed under union and complementation. A $\sigma$-algebra is an algebra that is closed under
countable union.) The second argument comes from a set $\mathcal{F}'$ of conditioning events, that
is, events on which conditioning is allowed. One natural choice is to take $\mathcal{F}'$ to
be $\mathcal{F} - \{\emptyset\}$. But it may be reasonable to consider other restrictions on $\mathcal{F}'$. For example,
Battigalli and Siniscalchi [2002] take $\mathcal{F}'$ to consist of the information sets in a game,
since they are interested only in agents who update their beliefs conditional on getting
some information. The question is what constraints, if any, should be placed on $\mathcal{F}'$.
For most of this paper, I focus on Popper spaces (named after Karl Popper), defined
next, where the set $\mathcal{F}'$ satisfies four arguably reasonable requirements, but I occasionally
consider other requirements (see Section 3.3).
Definition 2.1: A conditional probability space (cps) over $(W, \mathcal{F})$ is a tuple
$(W, \mathcal{F}, \mathcal{F}', \mu)$ such that $\mathcal{F}$ is an algebra over $W$, $\mathcal{F}'$ is a set of subsets of $W$
(not necessarily an algebra over $W$) that does not contain $\emptyset$, and
$\mu : \mathcal{F} \times \mathcal{F}' \to [0, 1]$ satisfies the following conditions:

CP1. $\mu(U \mid U) = 1$ if $U \in \mathcal{F}'$.

CP2. $\mu(V_1 \cup V_2 \mid U) = \mu(V_1 \mid U) + \mu(V_2 \mid U)$ if $V_1 \cap V_2 = \emptyset$, $U \in \mathcal{F}'$, and $V_1, V_2 \in \mathcal{F}$.

CP3. $\mu(V \mid U) = \mu(V \mid X) \times \mu(X \mid U)$ if $V \subseteq X \subseteq U$, $U, X \in \mathcal{F}'$, $V \in \mathcal{F}$.

Note that it follows from CP1 and CP2 that $\mu(\cdot \mid U)$ is a probability measure on $(W, \mathcal{F})$
(and, in particular, that $\mu(\emptyset \mid U) = 0$) for each $U \in \mathcal{F}'$. A Popper space over $(W, \mathcal{F})$
is a conditional probability space $(W, \mathcal{F}, \mathcal{F}', \mu)$ that satisfies three additional conditions:
(a) $\mathcal{F}' \subseteq \mathcal{F}$, (b) $\mathcal{F}'$ is closed under supersets in $\mathcal{F}$, in that if $V \in \mathcal{F}'$, $V \subseteq V'$, and
$V' \in \mathcal{F}$, then $V' \in \mathcal{F}'$, and (c) if $U \in \mathcal{F}'$ and $\mu(V \mid U) \ne 0$ then $V \cap U \in \mathcal{F}'$. If $\mathcal{F}$ is a
$\sigma$-algebra and $\mu$ is countably additive (that is, if
$\mu(\bigcup_{i=1}^{\infty} V_i \mid U) = \sum_{i=1}^{\infty} \mu(V_i \mid U)$ if the $V_i$'s
are pairwise disjoint elements of $\mathcal{F}$ and $U \in \mathcal{F}'$), then the Popper space is said to be
countably additive. Let $Pop(W, \mathcal{F})$ denote the set of Popper spaces over $(W, \mathcal{F})$. If $\mathcal{F}$ is
a $\sigma$-algebra, I use a superscript $c$ to denote the restriction to countably additive Popper
spaces, so $Pop^c(W, \mathcal{F})$ denotes the set of countably additive Popper spaces over $(W, \mathcal{F})$.
The probability measure $\mu$ in a Popper space is called a Popper measure. ∎

The last regularity condition on $\mathcal{F}'$ required in a Popper space corresponds to the
observation that for an unconditional probability measure $\mu$, if $\mu(V \mid U) \ne 0$ then $\mu(V \cap U) \ne 0$,
so conditioning on $V \cap U$ should be defined. Note that, since this regularity condition
depends on the Popper measure, it may well be the case that $(W, \mathcal{F}, \mathcal{F}', \mu)$ and $(W, \mathcal{F}, \mathcal{F}', \nu)$
are both cps's over $(W, \mathcal{F})$, but only the former is a Popper space over $(W, \mathcal{F})$.

Popper [1934, 1968] and de Finetti [1936] were the first to formally consider con- 
ditional probability as the basic notion, although as Renyi [1964] points out, the idea 
of taking conditional probability as primitive seems to go back as far as Keynes [1921]. 
CP1-3 are essentially due to Renyi [1955]. Van Fraassen [1976] defined what I have called 
Popper measures; he called them Popper functions, reserving the name Popper measure 
for what I am calling a countably additive Popper measure. Starting from the work of de 
Finetti, there has been a general study of coherent conditional probabilities. A coherent
conditional probability is essentially a cps that is not necessarily a Popper space, since it
is defined on a set $\mathcal{F} \times \mathcal{F}'$ where $\mathcal{F}'$ does not have to be a subset of $\mathcal{F}$; see, for example,
[Coletti and Scozzafava 2002] and the references therein. Hammond [1994] discusses the
use of conditional probability spaces in philosophy and game theory, and provides an 
extensive list of references. 

2.2 Lexicographic probability spaces 

Definition 2.2: A lexicographic probability space (LPS) (of length $\alpha$) over $(W, \mathcal{F})$ is a
tuple $(W, \mathcal{F}, \vec{\mu})$ where, as before, $W$ is a set of possible worlds and $\mathcal{F}$ is an algebra over
$W$, and $\vec{\mu}$ is a sequence of finitely additive probability measures on $(W, \mathcal{F})$ indexed by
ordinals $< \alpha$. (Technically, $\vec{\mu}$ is a function from the ordinals less than $\alpha$ to probability
measures on $(W, \mathcal{F})$.) I typically write $\vec{\mu}$ as $(\mu_0, \mu_1, \ldots)$ or as $(\mu_\beta : \beta < \alpha)$. If $\mathcal{F}$ is a
$\sigma$-algebra and each of the probability measures in $\vec{\mu}$ is countably additive, then $\vec{\mu}$ is a
countably additive LPS. Let $LPS(W, \mathcal{F})$ denote the set of LPS's over $(W, \mathcal{F})$. Again, if
$\mathcal{F}$ is a $\sigma$-algebra, a superscript $c$ is used to denote countable additivity, so $LPS^c(W, \mathcal{F})$
denotes the set of countably additive LPS's over $(W, \mathcal{F})$. When $(W, \mathcal{F})$ are understood, I
often refer to $\vec{\mu}$ as the LPS. I write $\vec{\mu}(U) > 0$ if $\mu_\beta(U) > 0$ for some $\beta$. ∎

There is a sense in which $LPS(W, \mathcal{F})$ can capture a richer set of preferences than
$Pop(W, \mathcal{F})$, even if we restrict to finite spaces $W$ (so that countable additivity is not
an issue). For example, suppose that $W = \{w_1, w_2\}$, $\mu_0(w_1) = \mu_0(w_2) = 1/2$, and
$\mu_1(w_1) = 1$. The LPS $\vec{\mu} = (\mu_0, \mu_1)$ can be thought of as describing the situation where $w_1$
is very slightly more likely than $w_2$. Thus, for example, if $X_i$ is a bet that pays off 1 in
state $w_i$ and 0 in state $w_{3-i}$, then according to $\vec{\mu}$, $X_1$ should be (slightly) preferred to
$X_2$, but for all $r > 1$, $rX_2$ is preferred to $X_1$. There is no CPS on $\{w_1, w_2\}$ that leads to
these preferences.
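The preferences in this two-world example can be checked numerically. The following is a minimal sketch, assuming that an LPS on a finite space is stored as a list of dictionaries and that bets are compared by the lexicographic order on their vectors of expected values; the names `lex_expectation`, `prefers`, and `scale` are illustrative and do not come from the paper.

```python
from fractions import Fraction


def lex_expectation(lps, bet):
    """Vector (E_mu0[bet], E_mu1[bet], ...) of expected payoffs, one per measure."""
    return tuple(sum(p * bet.get(w, 0) for w, p in mu.items()) for mu in lps)


def prefers(lps, a, b):
    """True if bet a is strictly preferred to bet b.

    Python compares tuples lexicographically, which is the order assumed here.
    """
    return lex_expectation(lps, a) > lex_expectation(lps, b)


def scale(r, bet):
    """The bet r * bet, which pays r times as much in every state."""
    return {w: r * payoff for w, payoff in bet.items()}


half = Fraction(1, 2)
lps = [{"w1": half, "w2": half}, {"w1": Fraction(1)}]  # (mu_0, mu_1) from the example
X1 = {"w1": 1, "w2": 0}  # pays 1 in w1, 0 in w2
X2 = {"w1": 0, "w2": 1}  # pays 1 in w2, 0 in w1

x1_beats_x2 = prefers(lps, X1, X2)
scaled_x2_beats_x1 = prefers(lps, scale(Fraction(101, 100), X2), X1)
```

Here $X_1$ beats $X_2$ only at the second coordinate, while any scaling of $X_2$ by $r > 1$ already wins at the first coordinate.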

Note that, in this example, the support of $\mu_1$ is a subset of that of $\mu_0$. To obtain a
bijection between LPS's and CPS's, we cannot allow much overlap between the supports
of the measures that make up an LPS. What counts as "much overlap" turns out to be
somewhat subtle. One way to formalize it was proposed by BBD. They defined a
lexicographic conditional probability space (LCPS) to be an LPS such that, roughly speaking,
the probability measures in the sequence have disjoint supports; more precisely, there
exist sets $U_\beta \in \mathcal{F}$ such that $\mu_\beta(U_\beta) = 1$ and the sets $U_\beta$ are pairwise disjoint for $\beta < \alpha$.
One motivation for considering disjoint sets is to consider an agent who has a sequence
of hypotheses $(h_0, h_1, \ldots)$ regarding how the world works. If the primary hypothesis $h_0$
is discarded, then the agent judges events according to $h_1$; if $h_1$ is discarded, then the
agent uses $h_2$, and so on. Associated with hypothesis $h_\beta$ is the probability measure $\mu_\beta$.
What would cause $h_\beta$ to be discarded is observing an event $U$ such that $\mu_\beta(U) = 0$. The
set $U_\beta$ is the support of the hypothesis $h_\beta$. In some cases, it seems reasonable to think
of the supports of these hypotheses as disjoint. This leads to LCPS's.

BBD considered only finite spaces. When we move to infinite spaces, requiring
disjointness of the supports of hypotheses may be too strong. Brandenburger, Friedenberg,
and Keisler [2008] consider finite-length LPS's $\vec{\mu}$ that satisfy the property that there exist
sets $U_\beta$ (not necessarily disjoint) such that $\mu_\beta(U_\beta) = 1$ and $\mu_\beta(U_\gamma) = 0$ for $\gamma \ne \beta$. Call
such an LPS an MSLPS (for mutually singular LPS). Let a structured LPS (SLPS) be an
LPS $\vec{\mu}$ such that there exist sets $U_\beta \in \mathcal{F}$ such that $\mu_\beta(U_\beta) = 1$ and $\mu_\beta(U_\gamma) = 0$ for $\gamma > \beta$.
Thus, in an SLPS, later hypotheses are given probability 0 according to the
probability measure induced by earlier hypotheses, but earlier hypotheses do not necessarily get
probability 0 according to the later hypotheses. (Spohn [1986] also considered SLPS's; he
called them dimensionally well-ordered families of probability measures.) Clearly every
LCPS is an MSLPS, and every MSLPS is an SLPS. If $\alpha$ is countable and we require
countable additivity (or if $\alpha$ is finite) then the notions are easily seen to coincide. Given
an SLPS $\vec{\mu}$ with associated sets $U_\beta$, $\beta < \alpha$, define $U'_\beta = U_\beta - (\bigcup_{\gamma > \beta} U_\gamma)$. The sets $U'_\beta$ are
clearly pairwise disjoint elements of $\mathcal{F}$, and $\mu_\beta(U'_\beta) = 1$. However, in general, LCPS's are
a strict subset of MSLPS's, and MSLPS's are a strict subset of SLPS's, as the following
two examples show.

Example 2.3: Consider a well-ordering of the interval $[0, 1]$, that is, a bijection from
$[0, 1]$ to an initial segment of the ordinals. Suppose that this initial segment of the
ordinals has length $\alpha$. Let $([0, 1], \mathcal{F}, \vec{\mu})$ be an LPS of length $\alpha$ where $\mathcal{F}$ consists of the
Borel subsets of $[0, 1]$. Let $\mu_0$ be the standard Borel measure on $[0, 1]$, and let $\mu_\beta$ be the
measure that gives probability 1 to $r_\beta$, the $\beta$th real in the well-ordering. This clearly
gives an SLPS, since we can take $U_0 = [0, 1]$ and $U_\beta = \{r_\beta\}$ for $0 < \beta < \alpha$; note that
$\mu_\gamma(U_\beta) = 0$ for $\beta > \gamma$. However, this SLPS is not equivalent to any MSLPS (and hence
not to any LCPS); there is no set $U'_0$ such that $\mu_0(U'_0) = 1$ and $U'_0$ is disjoint from $\{r_\beta\}$ for
all $\beta$ with $0 < \beta < \alpha$. ∎

Example 2.4: Suppose that $W = [0, 1] \times [0, 1]$. Again, consider a well-ordering on $[0, 1]$.
Using the notation of Example 2.3, define $U_{0,\beta} = \{r_\beta\} \times [0, 1]$ and $U_{1,\beta} = [0, 1] \times \{r_\beta\}$. Define
$\mu_{i,\beta}$ to be the Borel measure on $U_{i,\beta}$. Consider the LPS $(\mu_{0,0}, \mu_{0,1}, \ldots, \mu_{1,0}, \mu_{1,1}, \ldots)$.
Clearly this is an MSLPS, but not an LCPS. ∎

The difference between LCPS's, MSLPS's, and SLPS's does not arise in the work 
of BBD, since they consider only finite sequences of measures. The restriction to finite 
sequences, in turn, is due to their restriction to finite sets W of possible worlds. Clearly, 
if $W$ is finite, then all LCPS's over $W$ must have length $\le |W|$, since the measures in an
LCPS have disjoint supports. Here it will play a more significant role. 

We can put an obvious lexicographic order $<_L$ on sequences $(x_0, x_1, \ldots)$ of numbers
in $[0, 1]$ of length $\alpha$: $(x_0, x_1, \ldots) <_L (y_0, y_1, \ldots)$ if there exists $\beta < \alpha$ such that $x_\beta < y_\beta$
and $x_\gamma = y_\gamma$ for all $\gamma < \beta$. That is, we compare two sequences by comparing their
components at the first place they differ. (Even if $\alpha$ is infinite, because we are dealing
with ordinals, there will be a least ordinal at which the sequences differ if they differ at
all.) This lexicographic order will be used to define decision rules.

BBD define conditioning in LPS's as follows. Given $\vec{\mu}$ and $U \in \mathcal{F}$ such that $\vec{\mu}(U) > 0$,
let $\vec{\mu}|U = (\mu_{k_0}(\cdot \mid U), \mu_{k_1}(\cdot \mid U), \ldots)$, where $(k_0, k_1, \ldots)$ is the subsequence of all indices for
which the probability of $U$ is positive. Formally, $k_0 = \min\{k : \mu_k(U) > 0\}$ and for an
arbitrary ordinal $\beta > 0$, if $k_\gamma$ has been defined for all $\gamma < \beta$ and there exists a measure
$\mu_\delta$ in $\vec{\mu}$ such that $\mu_\delta(U) > 0$ and $\delta > k_\gamma$ for all $\gamma < \beta$, then
$k_\beta = \min\{\delta : \mu_\delta(U) > 0, \delta > k_\gamma \text{ for all } \gamma < \beta\}$. Note that $\vec{\mu}|U$ is undefined if $\vec{\mu}(U) = 0$.
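For a finite-length LPS on a finite space, this definition amounts to dropping every measure that gives the event probability 0 and conditioning each remaining measure in the usual way. The sketch below illustrates this under those finiteness assumptions; the dictionary representation and the names `prob` and `condition_lps` are hypothetical, not BBD's.

```python
from fractions import Fraction


def prob(mu, event):
    """Probability that the measure mu (dict world -> probability) gives to event."""
    return sum(p for w, p in mu.items() if w in event)


def condition_lps(lps, event):
    """Condition a finite-length LPS on event.

    Keeps, in order, the measures mu_k with mu_k(event) > 0 and replaces each
    by mu_k(. | event). Raises ValueError when no measure gives the event
    positive probability, since conditioning is then undefined.
    """
    event = frozenset(event)
    conditioned = []
    for mu in lps:
        total = prob(mu, event)
        if total > 0:
            conditioned.append(
                {w: p / total for w, p in mu.items() if w in event and p > 0}
            )
    if not conditioned:
        raise ValueError("conditioning is undefined: the event has probability 0")
    return conditioned


half = Fraction(1, 2)
lps = [{"w1": half, "w2": half}, {"w3": Fraction(1)}]
given_w2_w3 = condition_lps(lps, {"w2", "w3"})
given_w3 = condition_lps(lps, {"w3"})
```

Conditioning on `{"w3"}` discards the first measure entirely, so the conditional LPS is shorter than the original one.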

2.3 Nonstandard probability spaces 

It is well known that there exist non-Archimedean fields, that is, fields that include the real
numbers as a subfield but also have infinitesimals, numbers that are positive but still
less than any positive real number. The smallest such non-Archimedean field, commonly
denoted $\mathbb{R}(\epsilon)$, is the smallest field generated by adding to the reals a single infinitesimal
$\epsilon$.$^2$ We can further restrict to non-Archimedean fields that are elementary extensions
of the standard reals: they agree with the standard reals on all properties that can be
expressed in a first-order language with a predicate $N$ representing the natural numbers.
For most of this paper, I use only the following properties of non-Archimedean fields:

1. If $\mathbb{R}^*$ is a non-Archimedean field, then for all $b \in \mathbb{R}^*$ such that $-r < b < r$ for
some standard real $r > 0$, there is a unique closest real number $a$ such that $|a - b|$ is
an infinitesimal. (Formally, $a$ is the inf of the set of real numbers that are at least
as large as $b$.) Let $st(b)$ denote the closest standard real to $b$; $st(b)$ is sometimes
read "the standard part of $b$".

2. If $st(\epsilon/\epsilon') = 0$, then $a\epsilon < \epsilon'$ for all positive standard real numbers $a$. (If $a\epsilon$ were
greater than $\epsilon'$, then $\epsilon/\epsilon'$ would be greater than $1/a$, contradicting the assumption
that $st(\epsilon/\epsilon') = 0$.)

Given a non-Archimedean field $\mathbb{R}^*$, a nonstandard probability space (NPS) over $(W, \mathcal{F})$
(with range $\mathbb{R}^*$) is a tuple $(W, \mathcal{F}, \mu)$, where $W$ is a set of possible worlds, $\mathcal{F}$ is an
algebra of subsets of $W$, and $\mu$ assigns to sets in $\mathcal{F}$ a nonnegative element of $\mathbb{R}^*$ such that
$\mu(W) = 1$ and $\mu(U \cup V) = \mu(U) + \mu(V)$ if $U$ and $V$ are disjoint.$^3$

If $W$ is infinite, we may also require that $\mathcal{F}$ be a $\sigma$-algebra and that $\mu$ be countably
additive. (There are some subtleties involved with countable additivity in nonstandard
probability spaces; see Section 4.3.)

3 Relating Popper Spaces to (S)LPS's 

In this section, I consider a mapping $F_{S \to P}$ from SLPS's over $(W, \mathcal{F})$ to Popper spaces
over $(W, \mathcal{F})$, for each fixed $W$ and $\mathcal{F}$, and show that, in many cases of interest, $F_{S \to P}$
is a bijection. Given an SLPS $(W, \mathcal{F}, \vec{\mu})$ of length $\alpha$, consider the cps $(W, \mathcal{F}, \mathcal{F}', \mu)$ such
that $\mathcal{F}' = \bigcup_{\beta < \alpha} \{V \in \mathcal{F} : \mu_\beta(V) > 0\}$. For $V \in \mathcal{F}'$, let $\beta_V$ be the smallest index
such that $\mu_{\beta_V}(V) > 0$. Define $\mu(U \mid V) = \mu_{\beta_V}(U \mid V)$. I leave it to the reader to check that
$(W, \mathcal{F}, \mathcal{F}', \mu)$ is a Popper space.
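On a finite space, the mapping from an SLPS to a Popper measure just described can be written out directly. The following is a sketch under the assumption that the algebra consists of all subsets of a finite $W$ and that the LPS has finite length; `popper_from_slps`, `can_condition`, and `cond` are illustrative names only.

```python
from fractions import Fraction


def mass(mu, event):
    """Probability that the measure mu (dict world -> probability) gives to event."""
    return sum(p for w, p in mu.items() if w in event)


def popper_from_slps(lps):
    """Return (can_condition, cond) induced by a finite-length LPS.

    can_condition(V) tests whether some measure gives V positive probability;
    cond(U, V) is the probability of U given V under the first such measure.
    """
    def first_index(V):
        for beta, mu in enumerate(lps):
            if mass(mu, V) > 0:
                return beta
        return None

    def can_condition(V):
        return first_index(V) is not None

    def cond(U, V):
        beta = first_index(V)
        if beta is None:
            raise ValueError("V is not a conditioning event")
        mu = lps[beta]
        return mass(mu, set(U) & set(V)) / mass(mu, V)

    return can_condition, cond


# A two-level SLPS with disjoint supports {w1} and {w2, w3}.
slps = [{"w1": Fraction(1)}, {"w2": Fraction(1, 3), "w3": Fraction(2, 3)}]
can_condition, cond = popper_from_slps(slps)
```

With this SLPS, conditioning on `{"w2", "w3"}` is decided by the second measure, while conditioning on any set containing `w1` is decided by the first.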

There are many bijections between two spaces. Why is $F_{S \to P}$ of interest? Suppose
that $F_{S \to P}(W, \mathcal{F}, \vec{\mu}) = (W, \mathcal{F}, \mathcal{F}', \mu)$. It is easy to check that the following two important
properties hold:

1. $\mathcal{F}'$ consists precisely of those events for which conditioning in the LPS is defined;
that is, $\mathcal{F}' = \{U : \vec{\mu}(U) > 0\}$.

2. For $U \in \mathcal{F}'$, $\mu(\cdot \mid U) = \mu'(\cdot \mid U)$, where $\mu'$ is the first probability measure in the
sequence $\vec{\mu}|U$. That is, the Popper measure agrees with the most significant
probability measure in the conditional LPS given $U$. Given that an LPS assigns to an
event $U$ a sequence of numbers and a Popper measure assigns to $U$ just a single
number, this is clearly the best single number to take.

It is clear that these two properties in fact characterize $F_{S \to P}$. Thus, $F_{S \to P}$ preserves the
events on which conditioning is possible and the most significant term in the lexicographic
probability.

2 The construction of $\mathbb{R}(\epsilon)$ apparently goes back to Robinson [1973]. It is reviewed by Hammond
[1994, 1999] and Wilson [1995] (who calls $\mathbb{R}(\epsilon)$ the extended reals).

3 Note that, unlike Hammond [1994, 1999], I do not restrict the range of probability measures to
consist of ratios of polynomials in $\epsilon$ with nonnegative coefficients.

3.1 The finite case 

It is useful to separate the analysis of $F_{S \to P}$ into two cases, depending on whether or not
the state space is finite. I consider the finite case first.

BBD claim without proof that $F_{S \to P}$ is a bijection from LCPS's to conditional
probability spaces. They work in finite spaces $W$ (so that LCPS's are equivalent to SLPS's)
and restrict attention to LPS's where $\mathcal{F} = 2^W$ and $\mathcal{F}' = 2^W - \{\emptyset\}$ (so that conditioning
is defined for all nonempty sets). Since $\mathcal{F}' = 2^W - \{\emptyset\}$, the cps's they consider are all
Popper spaces. Hammond [1994] provides a careful proof of this result, under the
restrictions considered by BBD. I generalize Hammond's result by considering finite Popper
spaces with arbitrary conditioning events. No new conceptual issues arise in doing this
extension; I include it here only to be able to contrast it with the other results.

Let $SLPS(W, \mathcal{F})$ denote the set of SLPS's over $(W, \mathcal{F})$; let $SLPS(W, \mathcal{F}, \mathcal{F}')$ denote the
set of SLPS's $(W, \mathcal{F}, \vec{\mu})$ such that $\vec{\mu}(U) > 0$ for all $U \in \mathcal{F}'$ (i.e., $\mu_\beta(U) > 0$ for some $\beta$); as
usual, I use a superscript $c$ to denote countable additivity, so, for example, $SLPS^c(W, \mathcal{F})$
denotes the set of countably additive SLPS's over $(W, \mathcal{F})$. Let $Pop(W, \mathcal{F}, \mathcal{F}')$ denote the
set of Popper spaces of the form $(W, \mathcal{F}, \mathcal{F}', \mu)$ and let $Pop^c(W, \mathcal{F}, \mathcal{F}')$ denote the set of
Popper spaces of the form $(W, \mathcal{F}, \mathcal{F}', \mu)$ where $\mu$ is countably additive.

Theorem 3.1: If $W$ is finite, then $F_{S \to P}$ is a bijection from $SLPS(W, \mathcal{F}, \mathcal{F}')$ to $Pop(W, \mathcal{F}, \mathcal{F}')$.

Proof: It is immediate from the definition that if $(W, \mathcal{F}, \vec{\mu}) \in SLPS(W, \mathcal{F}, \mathcal{F}')$, then
$F_{S \to P}(W, \mathcal{F}, \vec{\mu}) \in Pop(W, \mathcal{F}, \mathcal{F}')$. It is also straightforward to show that $F_{S \to P}$ is an
injection (see the appendix for details). The work comes in showing that $F_{S \to P}$ is a
surjection (or, equivalently, in constructing an inverse to $F_{S \to P}$). I sketch the main ideas
of the argument here, leaving details to the appendix.

Given $\mu \in Pop(W, \mathcal{F}, \mathcal{F}')$, the idea is to choose $k < |W|$ and disjoint sets $U_0, \ldots, U_k \in
\mathcal{F}'$ appropriately such that $\mu_j = \mu \mid U_j$ for $j = 0, \ldots, k$ (i.e., $\mu_j(V) = \mu(V \mid U_j)$) and
$F_{S \to P}(W, \mathcal{F}, \vec{\mu}) = \mu$. Since the sets $U_0, \ldots, U_k$ are disjoint, $\vec{\mu}$ must be an SLPS. The
difficulty lies in choosing $U_0, \ldots, U_k$ so that $\vec{\mu}(U) > 0$ iff $U \in \mathcal{F}'$. This is done as follows.
Let $U_0$ be the smallest set $U \in \mathcal{F}$ such that $\mu(U \mid W) = 1$. Since $W$ is finite, there is such
a smallest set; it is simply the intersection of all sets $U$ such that $\mu(U \mid W) = 1$. Since
$\mu(U_0 \mid W) > 0$, it follows that $U_0 \in \mathcal{F}'$. If $\overline{U_0} \notin \mathcal{F}'$, then (because $\mathcal{F}'$ is closed under
supersets in $\mathcal{F}$), no subset of $\overline{U_0}$ is in $\mathcal{F}'$. If $\overline{U_0} \in \mathcal{F}'$, let $U_1$ be the smallest set in $\mathcal{F}$ such
that $\mu(U_1 \mid \overline{U_0}) = 1$. Note that $U_1 \subseteq \overline{U_0}$ and that $U_1 \in \mathcal{F}'$. Continuing in this way, it is
clear that there exists a $k \ge 0$ and a sequence of pairwise disjoint sets $U_0, U_1, \ldots, U_k$ such
that (1) $U_i \in \mathcal{F}'$ for $i = 0, \ldots, k$, (2) for $i < k$, $\overline{U_0 \cup \ldots \cup U_i} \in \mathcal{F}'$ and $U_{i+1}$ is the smallest
set in $\mathcal{F}$ such that $\mu(U_{i+1} \mid \overline{U_0 \cup \ldots \cup U_i}) = 1$, and (3) $\overline{U_0 \cup \ldots \cup U_k} \notin \mathcal{F}'$. Condition
(2) guarantees that $U_{i+1}$ is a subset of $\overline{U_0 \cup \ldots \cup U_i}$, so the $U_i$'s are pairwise disjoint.
Define the LPS $\vec{\mu} = (\mu_0, \ldots, \mu_k)$ by taking $\mu_i(V) = \mu(V \mid U_i)$. Clearly the support of $\mu_i$
is $U_i$, so this is an LCPS (and hence an SLPS). ∎
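The inductive construction in this proof sketch can be illustrated on a small finite space. The code below is a sketch only, assuming that the algebra is the full power set of $W$, so that the smallest set of conditional probability 1 given $X$ is the set of worlds in $X$ whose singletons have positive probability given $X$. The helper `make_popper`, which builds a toy Popper measure to test against, and the name `lcps_from_popper` are hypothetical.

```python
from fractions import Fraction


def make_popper(lps):
    """Toy Popper measure induced by a finite LPS (dicts world -> probability)."""
    def mass(mu, event):
        return sum(p for w, p in mu.items() if w in event)

    def can_condition(U):
        return any(mass(mu, U) > 0 for mu in lps)

    def cond(V, U):
        # Use the first measure that gives U positive probability.
        mu = next(m for m in lps if mass(m, U) > 0)
        return mass(mu, set(V) & set(U)) / mass(mu, U)

    return cond, can_condition


def lcps_from_popper(worlds, cond, can_condition):
    """Recover supports U_0, ..., U_k and measures mu_i(V) = mu(V | U_i).

    At each step, `remaining` is the complement of the supports found so far;
    the loop stops once that complement is no longer a conditioning event.
    """
    remaining = frozenset(worlds)
    supports, measures = [], []
    while remaining and can_condition(remaining):
        # Smallest set of conditional probability 1 given `remaining`.
        support = frozenset(w for w in remaining if cond({w}, remaining) > 0)
        supports.append(support)
        measures.append({w: cond({w}, support) for w in support})
        remaining = remaining - support
    return supports, measures


cond, can_condition = make_popper(
    [{"a": Fraction(1)}, {"b": Fraction(1, 3), "c": Fraction(2, 3)}]
)
supports, measures = lcps_from_popper({"a", "b", "c"}, cond, can_condition)
```

In this run the recovered measures coincide with the LPS used to build the toy Popper measure, which is the round trip the theorem asserts for finite $W$.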

Corollary 3.2: If $W$ is finite, then $F_{S \to P}$ is a bijection from $SLPS(W, \mathcal{F})$ to $Pop(W, \mathcal{F})$.

3.2 The infinite case

The case where the state space W is infinite is not considered by either BBD or Hammond. 
It presents some interesting subtleties. 

It is easy to see that $F_{S \to P}$ is an injection from SLPS's to Popper spaces. However,
as the following two examples show, if we do not require countable additivity, then it is
not a bijection.

Example 3.3: (This example is essentially due to Robert Stalnaker [private
communication, 2000].) Let $W = \mathbb{N}$, the natural numbers, let $\mathcal{F}$ consist of the finite and
cofinite subsets of $\mathbb{N}$ (recall that a cofinite set is the complement of a finite set), and
let $\mathcal{F}' = \mathcal{F} - \{\emptyset\}$. If $U$ is cofinite, take $\mu^1(V \mid U)$ to be 1 if $V$ is cofinite and 0 if $V$ is
finite. If $U$ is finite, define $\mu^1(V \mid U) = |V \cap U|/|U|$. I leave it to the reader to check
that $(\mathbb{N}, \mathcal{F}, \mathcal{F}', \mu^1)$ is a Popper space. Note that $\mu^1$ is not countably additive (since
$\mu^1(\{i\} \mid \mathbb{N}) = 0$ for all $i$, although $\mu^1(\mathbb{N} \mid \mathbb{N}) = 1$). Suppose that there were some LPS
$(\mathbb{N}, \mathcal{F}, \vec{\mu})$ which was mapped by $F_{S \to P}$ to this Popper space. Then it is easy to check that
if $\mu_i$ is the first measure in $\vec{\mu}$ such that $\mu_i(U) > 0$ for some finite set $U$, then $\mu_i(U') > 0$
for all nonempty finite sets $U'$. To see this, note that for any nonempty finite set $U'$,
since $\mu_i(U) > 0$, it follows that $\mu_i(U \cup U') > 0$. Since $U \cup U'$ is finite, it must be the
case that $\mu_i$ is the first measure in $\vec{\mu}$ such that $\mu_i(U \cup U') > 0$. Thus, by definition,
$\mu^1(U' \mid U \cup U') = \mu_i(U' \mid U \cup U')$. Since $\mu^1(U' \mid U \cup U') > 0$, it follows that $\mu_i(U') > 0$.
Thus, $\mu_i(U') > 0$ for all nonempty finite sets $U'$.

It is also easy to see that $\mu_i(U)$ must be proportional to $|U|$ for all finite sets $U$.
To show this, it clearly suffices to show that $\mu_i(n) = \mu_i(0)$ for all $n \in \mathbb{N}$. But this is
immediate from the observation that

$\mu_i(\{0\} \mid \{0, n\}) = \mu^1(\{0\} \mid \{0, n\}) = |\{0\}|/|\{0, n\}| = 1/2.$

But there is no probability measure $\mu_i$ on the natural numbers such that $\mu_i(n) = \mu_i(0) > 0$
for all $n > 0$. For if $\mu_i(0) > 1/N$, then $\mu_i(\{0, \ldots, N - 1\}) > 1$, a contradiction. (See
Example 4.8 for further discussion of this setup.) ∎

Example 3.4: Again, let $W = \mathbb{N}$, let $\mathcal{F}$ consist of the finite and cofinite subsets of $\mathbb{N}$,
and let $\mathcal{F}' = \mathcal{F} - \{\emptyset\}$. As with $\mu^1$, if $U$ is cofinite, take $\mu^2(V \mid U)$ to be 1 if $V$ is cofinite and
0 if $V$ is finite. However, now, if $U$ is finite, define $\mu^2(V \mid U) = 1$ if $\max(V \cap U) = \max U$,
and $\mu^2(V \mid U) = 0$ otherwise. Intuitively, if $n > n'$, then $n$ is infinitely more probable
than $n'$ according to $\mu^2$. Again, I leave it to the reader to check that $(\mathbb{N}, \mathcal{F}, \mathcal{F}', \mu^2)$ is a
Popper space. Suppose there were some LPS $(\mathbb{N}, \mathcal{F}, \vec{\mu})$ which was mapped by $F_{S \to P}$ to
this Popper space. Then it is easy to check that if $\mu_n$ is the first measure in $\vec{\mu}$ such that
$\mu_n(\{n\}) > 0$, then $\mu_n$ comes before $\mu_{n'}$ in $\vec{\mu}$ if $n > n'$. However, since $\vec{\mu}$ is well-founded,
this is impossible. ∎

As the following theorem, originally proved by Spohn [1986], shows, there are no 
such counterexamples if we restrict to countably additive SLPS's and countably additive 
Popper spaces. 

Theorem 3.5: [Spohn 1986] For all $W$, the map $F_{S \to P}$ is a bijection from $SLPS^c(W, \mathcal{F}, \mathcal{F}')$
to $Pop^c(W, \mathcal{F}, \mathcal{F}')$.

Proof: Again, the difficulty comes in showing that $F_{S \to P}$ is onto. Given a Popper space
$(W, \mathcal{F}, \mathcal{F}', \mu)$, I again construct sets $U_0, U_1, \ldots$ and an LPS $\vec{\mu}$ such that $\mu_\beta(V) = \mu(V \mid U_\beta)$,
and show that $F_{S \to P}(W, \mathcal{F}, \vec{\mu}) = (W, \mathcal{F}, \mathcal{F}', \mu)$. However, now a completely different
construction is required; the earlier inductive construction of the sequence $U_0, U_1, \ldots$ no
longer works. The problem already arises in the construction of $U_0$. There may no longer
be a smallest set $U_0$ such that $\mu(U_0 \mid W) = 1$. Consider, for example, the interval $[0, 1]$ with
Borel measure. There is clearly no smallest subset $U$ of $[0, 1]$ such that $\mu(U) = 1$. The
details can be found in the appendix. ∎

Corollary 3.6: For all W, the map F S _ P is a bijection from SLPS C (W^, T) to Pop c (H^, T). 

It is important in Corollary 3.6 that we consider SLPS's and not MSLPS's or LCPS's.
$F_{S \to P}$ is in fact not a bijection from MSLPS's or LCPS's to Popper spaces.

Example 3.7: Consider the Popper space $([0, 1], \mathcal{F}, \mathcal{F}', \mu)$ which is the image under
$F_{S \to P}$ of the SLPS constructed in Example 2.3. It is easy to see that this Popper space
cannot be the image under $F_{S \to P}$ of some MSLPS (and hence not of some LCPS either). ∎
3.3 Treelike CPS's



One of the requirements in a Popper space is that $\mathcal{F}'$ be closed under supersets in $\mathcal{F}$.
If we think of $\mathcal{F}'$ as consisting of all sets on which conditioning is possible, this makes
sense; if we can condition on a set $U$, we should be able to condition on a superset $V$ of
$U$. But if we think of $\mathcal{F}'$ as representing all the possible evidence that can be obtained
(and thus, the set of events on which an agent must be able to condition, so as to
update her beliefs), there is no reason that $\mathcal{F}'$ should be closed under supersets; nor, for
that matter, is it necessarily the case that if $U \in \mathcal{F}'$ and $\mu(V \mid U) \ne 0$, then
$V \cap U \in \mathcal{F}'$. In general, a cps where $\mathcal{F}'$ does not have these properties cannot be
represented by an LPS, as the following example shows.

Example 3.8: Let $W = \{w_1, w_2, w_3, w_4\}$, let $\mathcal{F}$ consist of all subsets of $W$, and let $\mathcal{F}'$
consist of all the 2-element subsets of $W$. Clearly $\mathcal{F}'$ is not closed under supersets. Define
$\mu$ on $\mathcal{F} \times \mathcal{F}'$ such that $\mu(w_1 \mid \{w_1, w_3\}) = \mu(w_4 \mid \{w_2, w_4\}) = 1/3$, and
$\mu(w_1 \mid \{w_1, w_2\}) = \mu(w_4 \mid \{w_3, w_4\}) = 1/2$, and CP1 and CP2 hold. This is easily seen to
determine $\mu$. Moreover, $\mu$ vacuously satisfies CP3, since there do not exist distinct sets
$U$ and $X$ in $\mathcal{F}'$ such that $U \subseteq X$. It is easy to show that there is no unconditional
probability $\mu^*$ on $W$ such that $\mu^*(U \mid V) = \mu(U \mid V)$ for all pairs $(U, V) \in \mathcal{F} \times \mathcal{F}'$
such that $\mu^*(V) > 0$ (where, for $\mu^*$, the conditional probability is defined in the
standard way).$^4$ It easily follows that there is no LPS $\vec{\mu}$ such that
$\vec{\mu}(U \mid V) = \mu(U \mid V)$ for all $(U, V) \in \mathcal{F} \times \mathcal{F}'$ (since otherwise $\mu_0$ would agree
with $\mu$ on all pairs $(U, V) \in \mathcal{F} \times \mathcal{F}'$ such that $\mu_0(V) > 0$). Had $\mathcal{F}'$ been
closed under supersets, it would have included $W$. It is easy to see that it is impossible
to extend $\mu$ to $\mathcal{F} \times (\mathcal{F}' \cup \{W\})$ so that CP3 holds. ∎
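
One way to see why no unconditional probability can work in Example 3.8 is to track the odds that the four specified conditional probabilities force. The sketch below (Python, with exact rationals) is an illustration rather than the argument given in the text; the dictionary `cond`, the helper `odds`, and the choice of cycle are assumptions introduced here. Since each specified conditional probability lies strictly between 0 and 1, an unconditional probability that agrees with the cps on a conditioning pair of positive measure must give both elements of that pair positive probability. The four pairs link all of $W$ into a cycle, so all four worlds would get positive probability, and multiplying the implied odds around the cycle would have to give 1.

```python
from fractions import Fraction

# cond[(x, y)] is the specified probability of x given the pair {x, y}.
cond = {
    ("w1", "w3"): Fraction(1, 3),
    ("w4", "w2"): Fraction(1, 3),
    ("w1", "w2"): Fraction(1, 2),
    ("w4", "w3"): Fraction(1, 2),
}


def odds(a, b):
    """Ratio p(a)/p(b) forced by the conditional probability on {a, b}."""
    if (a, b) in cond:
        q = cond[(a, b)]
        return q / (1 - q)
    q = cond[(b, a)]
    return (1 - q) / q


# Walk the cycle w1 -> w2 -> w4 -> w3 -> w1 and multiply the odds.
cycle = ["w1", "w2", "w4", "w3", "w1"]
product = Fraction(1)
for a, b in zip(cycle, cycle[1:]):
    product *= odds(a, b)

print(product)  # 4, but a genuine prior would force p(w1)/p(w1) = 1
```

The product is 4 rather than 1, so no assignment of positive probabilities to the four worlds is consistent with all four conditional probabilities at once.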

In the game-theory literature, Battigalli and Siniscalchi [2002] use conditional proba-
bility measures to model players' beliefs about other players' strategies in extensive-form
games where agents have perfect recall. The conditioning events are essentially informa-
tion sets, which can be thought of as representing the possible evidence that an agent can
obtain in a game. Thus, the cps's they consider are not necessarily Popper spaces, for
the reasons described above. Nevertheless, the conditioning events considered by Batti-
galli and Siniscalchi satisfy certain properties that prevent an analogue of Example 3.8
from holding. I now make this precise.

Formally, I assume that there is a one-to-one correspondence between the sets in
$\mathcal{F}'$ and the information sets of some fixed player $i$. For each set $U \in \mathcal{F}'$, there is a
unique information set $I_U$ for player $i$ such that $U$ consists of all the strategy profiles
that reach $I_U$. With this identification, it is immediate that we can organize the sets
in $\mathcal{F}'$ into a forest (i.e., a collection of trees), with the same "reachability" structure as
that of the information sets in the game tree. The topmost sets in the forest are the
ones corresponding to the topmost information sets for player $i$ in the game tree. There

4 This example is closely related to examples of conditional probabilities for which there is no common 
prior; see, for example, [Halpern 2002, Example 2.2]. 
may be several such topmost information sets if nature or some player $j$ other than $i$
makes the first move in the game. (That is why we have a forest, rather than a tree.)
The immediate successors of a set $U$ are the sets of strategy profiles corresponding to
information sets for player $i$ reached immediately after $I_U$. Because agents have perfect
recall, the conditioning events $\mathcal{F}'$ have the following properties:

T1. $\mathcal{F}'$ is countable.

T2. The elements of $\mathcal{F}'$ can be organized as a forest (i.e., a collection of trees) where,
for each $U \in \mathcal{F}'$, if there is an edge from $U$ to some $U' \in \mathcal{F}'$, then $U' \subseteq U$, all
the immediate successors of $U$ are disjoint, and $U$ is the union of its immediate
successors.

T3. The topmost nodes in each tree of the forest form a partition of $W$.

Say that a set $\mathcal{F}'$ is treelike if it satisfies T1-T3. It follows from T2 and T3 that, for any
sets $U$ and $U'$ in a treelike set $\mathcal{F}'$, either $U \subseteq U'$ (if $U$ is a descendant of $U'$ in some
tree), $U' \subseteq U$ (if $U'$ is a descendant of $U$), or $U$ and $U'$ are disjoint (if neither is a
descendant of the other). If $\mathcal{F}'$ is treelike, let $T^c(W, \mathcal{F}, \mathcal{F}')$ consist of all countably
additive cps's defined on $\mathcal{F} \times \mathcal{F}'$. I abuse notation in the next result, viewing $F_{S \to P}$
as a mapping from $SLPS^c(W, \mathcal{F}, \mathcal{F}')$ to $T^c(W, \mathcal{F}, \mathcal{F}')$.

Proposition 3.9: The map $F_{S \to P}$ is a surjection from $SLPS^c(W, \mathcal{F}, \mathcal{F}')$ onto $T^c(W, \mathcal{F}, \mathcal{F}')$.

Since $\mathcal{F}'$ is countable, every SLPS in $SLPS^c(W, \mathcal{F}, \mathcal{F}')$ must have at most countable
length. Thus, there is no distinction between SLPS's, LCPS's, and MSLPS's in this
case. (Indeed, in the proof of Proposition 3.9, the LPS constructed to demonstrate the
surjection is an LCPS.) Note that we cannot hope to get a bijection here, even if $W$ is
finite. For example, suppose that $W = \{w_1, w_2\}$, $\mathcal{F} = 2^W$, and $\mathcal{F}' = \{\{w_1\}, \{w_2\}\}$.
$\mathcal{F}'$ is clearly treelike, and there is a unique cps $\mu$ on $(W, \mathcal{F}, \mathcal{F}')$. $F_{S \to P}$ maps every
SLPS in $SLPS(W, \mathcal{F}, \mathcal{F}')$ to $\mu$, but is clearly not a bijection. (This example also shows
that we do not get a bijection by considering MSLPS's or LCPS's either.)

3.4 Related Work 

It is interesting to contrast these results to those of Rényi [1956] and van Fraassen [1976].
Rényi considers what he calls dimensionally ordered systems. A dimensionally ordered
system over $(W, \mathcal{F})$ has the form $(W, \mathcal{F}, \mathcal{F}', \{\mu_i : i \in I\})$, where $\mathcal{F}$ is an algebra of subsets
of $W$, $\mathcal{F}'$ is a subset of $\mathcal{F}$ closed under finite unions (but not necessarily closed under
supersets in $\mathcal{F}$), $I$ is a totally ordered set (but not necessarily well-founded, so it may
not, for example, have a first element) and $\mu_i$ is a measure on $(W, \mathcal{F})$ (not necessarily a
probability measure) such that




• for each $U \in \mathcal{F}'$, there is some $i \in I$ such that $0 < \mu_i(U) < \infty$ (note that the
measure of a set may, in general, be $\infty$),

• if $\mu_i(U) < \infty$ and $j < i$, then $\mu_j(U) = 0$.

Note that it follows from these conditions that for each $U \in \mathcal{F}'$, there is exactly one $i \in I$
such that $0 < \mu_i(U) < \infty$.

There is an obvious analogue of the map $F_{S \to P}$ mapping dimensionally ordered sys-
tems to cps's. Namely, let $F_{D \to C}$ map the dimensionally ordered system
$(W, \mathcal{F}, \mathcal{F}', \{\mu_i : i \in I\})$ to the cps $(W, \mathcal{F}, \mathcal{F}', \mu)$, where $\mu(V \mid U) = \mu_i(V \mid U)$ and $i$ is
the unique element of $I$ such that $0 < \mu_i(U) < \infty$. Rényi shows that $F_{D \to C}$ is a bijection
from dimensionally ordered systems to cps's where the set $\mathcal{F}'$ is closed under finite unions.
(Császár [1955] extends this result to cases where the set $\mathcal{F}'$ is not necessarily closed
under finite unions.) Rényi assumes that all measures involved are countably additive and
that $\mathcal{F}$ is a $\sigma$-algebra, but these are inessential assumptions. That is, his proof goes
through without change if $\mathcal{F}$ is an algebra and the measures are additive; all that happens
is that the resulting conditional probability measure is additive rather than $\sigma$-additive.

It is critical in Rényi's framework that the $\mu_i$'s are arbitrary measures, and not just
probability measures. His result does not hold if the $\mu_i$'s are required to be probability
measures. In the case of finitely additive measures, the Popper space constructed in
Example 3.3 already shows why. It corresponds to a dimensionally ordered space $(\mu_1, \mu_2)$
where $\mu_1(U)$ is 1 if $U$ is cofinite and 0 if $U$ is finite and $\mu_2(U) = |U|$ (i.e., the measure of
a set is its cardinality). It cannot be captured by a dimensionally ordered space where
all the elements are probability measures, for the same reason that it is not the image
of an SLPS under $F_{S \to P}$. (Rényi [1956] actually provides a general characterization of
when the $\mu_i$'s can be taken to be (countably additive) probability measures.) Another
example is provided by the Popper space considered in Example 3.4. This corresponds
to the dimensionally ordered system $\{\mu_n : n \in \mathbb{N} \cup \{\infty\}\}$, where

$$\mu_n(U) = \begin{cases} 0 & \text{if } \max(U) < n \\ 1 & \text{if } \max(U) = n \\ \infty & \text{if } \max(U) > n, \end{cases}$$

where $\max(U)$ is taken to be $\infty$ if $U$ is cofinite.

Krauss [1968] restricts to Popper algebras of the form $\mathcal{F} \times (\mathcal{F} - \{\emptyset\})$; this allows him
to simplify and generalize Rényi's analysis. Interestingly, he also proves a representation
theorem in the spirit of Rényi's that involves nonstandard probability.

Van Fraassen [1976] proves a result whose assumptions are somewhat closer to The- 
orem 3.5. Van Fraassen considers what he calls ordinal families of probability measures. 
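An ordinal family over $(W, \mathcal{F})$ is a sequence of the form $\{(W_\beta, \mathcal{F}_\beta, \mu_\beta) : \beta < \alpha\}$ such that

• $\bigcup_{\beta < \alpha} W_\beta = W$;

• $\mathcal{F}_\beta$ is an algebra over $W_\beta$;

• $\mu_\beta$ is a probability measure with domain $\mathcal{F}_\beta$;

• $\bigcup_{\beta < \alpha} \mathcal{F}_\beta = \mathcal{F}$;

• if $U \in \mathcal{F}$ and $V \in \mathcal{F}_\beta$, then $U \cap V \in \mathcal{F}_\beta$;

• if $U \in \mathcal{F}$, $U \cap V \in \mathcal{F}_\beta$, and $\mu_\beta(U \cap V) > 0$, then there exists $\gamma$ such that
$U \in \mathcal{F}_\gamma$ and $\mu_\gamma(U) > 0$.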

Given an ordinal family $\{(W_\beta, \mathcal{F}_\beta, \mu_\beta) : \beta < \alpha\}$ over $(W, \mathcal{F})$, consider the map
$F_{O \to C}$ which associates with it the cps $(W, \mathcal{F}, \mathcal{F}', \mu)$, where
$\mathcal{F}' = \{U \in \mathcal{F} : \mu_\gamma(U) > 0 \text{ for some } \gamma < \alpha\}$ and $\mu(V \mid U) = \mu_\beta(V \mid U)$, where $\beta$ is
the smallest ordinal such that $U \in \mathcal{F}_\beta$ and $\mu_\beta(U) > 0$. Van Fraassen shows that $F_{O \to C}$
is a bijection from ordinal families over $(W, \mathcal{F})$ to Popper spaces over $(W, \mathcal{F})$. Again, for
van Fraassen, countable additivity does not play a significant role. If $\mathcal{F}$ is a
$\sigma$-algebra, a countably additive ordinal family over $(W, \mathcal{F})$ is defined just as an ordinal
family, except that now $\mathcal{F}_\beta$ is a $\sigma$-algebra over $W_\beta$ for all $\beta < \alpha$, $\mu_\beta$ is a countably
additive probability measure, and $\mathcal{F}$ is the least $\sigma$-algebra containing
$\bigcup_{\beta < \alpha} \mathcal{F}_\beta$ (since $\bigcup_{\beta < \alpha} \mathcal{F}_\beta$ is not in general a $\sigma$-algebra). The
same map $F_{O \to C}$ is also a bijection from countably additive ordinal families to countably
additive Popper spaces.

Spohn's result, Theorem 3.5, can be viewed as a strengthening of van Fraassen's
result in the countably additive case, since for Theorem 3.5 all the $\mathcal{F}_\beta$'s are required to
be identical. This is a nontrivial requirement. The fact that it cannot be met in the case
that $W$ is infinite and the measures are not countably additive is an indication of this.

It is worth seeing how van Fraassen's approach handles the finitely additive examples
which do not correspond to SLPS's. The Popper space in Example 3.3 corresponds
to the ordinal family $\{(W_n, \mathcal{F}_n, \mu_n) : n \le \omega\}$ where, for $n < \omega$, $W_n = \{1, \ldots, n\}$,
$\mathcal{F}_n$ consists of all subsets of $W_n$, and $\mu_n$ is the uniform measure, while $W_\omega = \mathbb{N}$, $\mathcal{F}_\omega$
consists of the finite and cofinite subsets of $\mathbb{N}$, and $\mu_\omega(U)$ is 1 if $U$ is cofinite and 0 if
$U$ is finite. It is easy to check that this ordinal family has the desired properties. The
Popper space in Example 3.4 is represented in a similar way, using the ordinal family
$\{(W_n, \mathcal{F}_n, \mu'_n) : n \le \omega\}$, where $\mu'_n(U)$ is 1 if $n \in U$ and 0 otherwise, while
$\mu'_\omega = \mu_\omega$. I leave it to the reader to see that this family has the desired properties.
The key point to observe here is the leverage obtained by allowing each probability measure
to have a different domain.

4 Relating LPS's to NPS's 

In this section, I show that LPS's and NPS's are isomorphic in a strong sense. Again, I 
separate the results for the finite case and the infinite case. 
4.1 The finite case



Consider an LPS of the form $(\mu_1, \mu_2, \mu_3)$. Roughly speaking, the corresponding NPS
should be $(1 - \epsilon - \epsilon^2)\mu_1 + \epsilon\mu_2 + \epsilon^2\mu_3$, where $\epsilon$ is some infinitesimal. That means that
$\mu_2$ gets infinitesimal weight relative to $\mu_1$ and $\mu_3$ gets infinitesimal weight relative to
$\mu_2$. But which infinitesimal $\epsilon$ should be chosen? Intuitively, it shouldn't matter. No matter
which infinitesimal is chosen, the resulting NPS should be equivalent to the original LPS.
I now make this intuition precise.

Suppose that we want to use an LPS or an NPS to compute which of two bounded, 
real-valued random variables has higher expected value. The intended application here is 
decision making, where the random variables can be thought of as the utilities corre- 
sponding to two actions; the one with higher expected utility is preferred. The idea is 
that two measures of uncertainty (each of which can be an LPS or an NPS) are equivalent 
if the preference order they place on (real valued) random variables (according to their 
expected value) is the same. I consider only random variables with countable range. This 
restriction both makes the exposition simpler and avoids having to define, for example, 
integration with respect to an NPS. Note that, given an LPS $\vec{\mu}$, the expected value of
a random variable $X$ is $\sum_x x\,\vec{\mu}(X = x)$, where $\vec{\mu}(X = x)$ is a sequence of probability
values and the multiplication and addition are pointwise. Thus, the expected value is a
sequence; these sequences can be compared using the lexicographic order $<_L$ defined in
Section 2.2. If $\nu$ is either an LPS or NPS, then let $E_\nu(X)$ denote the expected value of
random variable $X$ according to $\nu$.
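
To make the comparison concrete, the following Python sketch computes the lexicographic expected value under a finite LPS $(\mu_0, \ldots, \mu_k)$ and the expected value under the nonstandard measure $(1 - \epsilon - \cdots - \epsilon^k)\mu_0 + \epsilon\mu_1 + \cdots + \epsilon^k\mu_k$, represented by its coefficients in powers of $\epsilon$. It is only an illustration under assumptions that are not in the text: a finite state space, measures given as dictionaries from states to exact rationals, and the hypothetical helper names `lps_expectation`, `nps_expectation`, and `poly_less`.

```python
from fractions import Fraction


def lps_expectation(lps, X):
    """Expected value of X under each measure of the LPS, as a tuple."""
    return tuple(sum(p * X[w] for w, p in mu.items()) for mu in lps)


def nps_expectation(lps, X):
    """Coefficients (c_0, ..., c_k) of E(X) = sum_j c_j * eps**j under
    nu = (1 - eps - ... - eps**k) mu_0 + eps mu_1 + ... + eps**k mu_k."""
    e = lps_expectation(lps, X)
    return (e[0],) + tuple(ej - e[0] for ej in e[1:])


def poly_less(p, q):
    """p < q for a positive infinitesimal eps: the lowest power of eps at
    which the coefficients differ decides the comparison."""
    for a, b in zip(p, q):
        if a != b:
            return a < b
    return False


half = Fraction(1, 2)
lps = [{"a": Fraction(1)}, {"b": half, "c": half}]  # (mu_0, mu_1)
X = {"a": 1, "b": 0, "c": 5}
Y = {"a": 1, "b": 2, "c": 0}

# Python compares tuples lexicographically, which is the order used for LPS's.
print(lps_expectation(lps, Y) < lps_expectation(lps, X))            # True
print(poly_less(nps_expectation(lps, Y), nps_expectation(lps, X)))  # True
```

In this instance the two random variables tie under $\mu_0$, so the comparison is settled by $\mu_1$ in the LPS and by the coefficient of $\epsilon$ in the NPS, and both give the same verdict.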

Definition 4.1: If each of $\nu_1$ and $\nu_2$ is either an NPS over $(W, \mathcal{F})$ or an LPS over $(W, \mathcal{F})$,
then $\nu_1$ is equivalent to $\nu_2$, denoted $\nu_1 \approx \nu_2$, if, for all real-valued random variables $X$
and $Y$ measurable with respect to $\mathcal{F}$, $E_{\nu_1}(X) \le E_{\nu_1}(Y)$ iff $E_{\nu_2}(X) \le E_{\nu_2}(Y)$. (If $X$
has countable range, which is the only case I consider here, then $X$ is measurable with
respect to $\mathcal{F}$ iff $\{w : X(w) = x\} \in \mathcal{F}$ for all $x$ in the range of $X$.)$^5$ ∎

This notion of equivalence satisfies analogues of the two key properties of the map
$F_{S \to P}$ considered at the beginning of Section 3.

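Proposition 4.2: If $\nu \in NPS(W, \mathcal{F})$, $\vec{\mu} \in LPS(W, \mathcal{F})$, and $\nu \approx \vec{\mu}$, then $\nu(U) > 0$ iff
$\vec{\mu}(U) > \vec{0}$. Moreover, if $\nu(U) > 0$, then $st(\nu(V \mid U)) = \mu_j(V \mid U)$, where $\mu_j$ is the first
probability measure in $\vec{\mu}$ such that $\mu_j(U) > 0$.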

5 As pointed out by Adam Brandenburger and Eddie Dekel, this notion of equivalence is essentially 
the same as one implicitly used by BBD. They work with preference orders on Anscombe-Aumann acts 
[Anscombe and Aumann 1963], that is, functions from states to probability measures on prizes. Fix a 
utility function $u$ on prizes. Then take $\nu_1 \approx_u \nu_2$ if the preference order on acts generated by $\nu_1$ and
$u$ is the same as that generated by $\nu_2$ and $u$. It is not hard to show that this notion of equivalence is
independent of the choice of utility function; if $u$ and $u'$ are two utility functions on prizes, then $\nu_1 \approx_u \nu_2$
iff $\nu_1 \approx_{u'} \nu_2$. Moreover, $\nu_1 \approx_u \nu_2$ iff $\nu_1 \approx \nu_2$. The advantage of the notion of equivalence used here is
that it is defined without the overhead of preference orders on acts. 
As the next result shows, for SLPS's, the $\approx$-equivalence classes are singletons, even
if the set of worlds is infinite. (This is not true for LPS's in general. For example,
$(\mu, \mu) \approx (\mu)$.) This can be viewed as providing more motivation for the use of SLPS's.

Proposition 4.3: If $\vec{\mu}, \vec{\mu}' \in SLPS(W, \mathcal{F})$, then $\vec{\mu} \approx \vec{\mu}'$ iff $\vec{\mu} = \vec{\mu}'$.

The next result justifies restricting to finite LPS's if the state space is finite. Given
an algebra $\mathcal{F}$, let $Basic(\mathcal{F})$ consist of the basic sets in $\mathcal{F}$, that is, the nonempty sets
in $\mathcal{F}$ that themselves contain no nonempty proper subsets in $\mathcal{F}$. Clearly the sets in
$Basic(\mathcal{F})$ are disjoint, so that $|Basic(\mathcal{F})| \le |W|$. If all sets are measurable, then
$Basic(\mathcal{F})$ consists of the singleton subsets of $W$. If $W$ is finite, it is easy to see that
all sets in $\mathcal{F}$ are finite unions of the sets in $Basic(\mathcal{F})$.

Proposition 4.4: If $W$ is finite, then every LPS over $(W, \mathcal{F})$ is equivalent to an LPS
of length at most $|Basic(\mathcal{F})|$.

I can now define the bijection that relates NPS's and LPS's. Given $(W, \mathcal{F})$, let
$LPS(W, \mathcal{F})/{\approx}$ be the equivalence classes of $\approx$-equivalent LPS's over $(W, \mathcal{F})$; similarly,
let $NPS(W, \mathcal{F})/{\approx}$ be the equivalence classes of $\approx$-equivalent NPS's over $(W, \mathcal{F})$. Note
that in $NPS(W, \mathcal{F})/{\approx}$, it is possible that different nonstandard probability measures
could have different ranges. For this section, without loss of generality, I could also
fix the range of all NPS's to be the nonstandard model $\mathbb{R}(\epsilon)$ discussed in Section 2.3.
However, in the infinite case, it is not possible to restrict to a single nonstandard model,
so I do not do so here either, for uniformity.

Now define the mapping $F_{L \to N}$ from $LPS(W, \mathcal{F})/{\approx}$ to $NPS(W, \mathcal{F})/{\approx}$ pretty much as
suggested at the beginning of this subsection: If $[\vec{\mu}]$ is an equivalence class of LPS's, then
choose a representative $\vec{\mu}' \in [\vec{\mu}]$ with finite length. Fix an infinitesimal $\epsilon$. Suppose that
$\vec{\mu}' = (\mu_0, \ldots, \mu_k)$. Let $F_{L \to N}([\vec{\mu}]) = [(1 - \epsilon - \cdots - \epsilon^k)\mu_0 + \epsilon\mu_1 + \cdots + \epsilon^k\mu_k]$.

Theorem 4.5: If $W$ is finite, then $F_{L \to N}$ is a bijection from $LPS(W, \mathcal{F})/{\approx}$ to $NPS(W, \mathcal{F})/{\approx}$
that preserves equivalence (that is, each NPS in $F_{L \to N}([\vec{\mu}])$ is equivalent to $\vec{\mu}$).

Proof: It is easy to check that if $\vec{\mu} = (\mu_0, \ldots, \mu_k)$, then
$\vec{\mu} \approx (1 - \epsilon - \cdots - \epsilon^k)\mu_0 + \epsilon\mu_1 + \cdots + \epsilon^k\mu_k$ (see Lemma A.7 in the appendix for
a formal proof). It follows that $F_{L \to N}$ is an injection from $LPS(W, \mathcal{F})/{\approx}$ to
$NPS(W, \mathcal{F})/{\approx}$. To show that $F_{L \to N}$ is a surjection, we must essentially construct an
inverse map; that is, given an NPS $(W, \mathcal{F}, \nu)$ where $W$ is finite, we must find an LPS $\vec{\mu}$
such that $\vec{\mu} \approx \nu$. The idea is to find a finite collection $\mu_0, \ldots, \mu_k$ of (standard)
probability measures, where $k < |W|$, and nonnegative nonstandard reals $\epsilon_0, \ldots, \epsilon_k$ such
that $st(\epsilon_{i+1}/\epsilon_i) = 0$ and $\nu = \epsilon_0\mu_0 + \cdots + \epsilon_k\mu_k$. A straightforward argument then
shows that $\nu \approx \vec{\mu}$ and $F_{L \to N}([\vec{\mu}]) = [\nu]$. I leave details to the appendix. ∎
BBD [1991a] also relate nonstandard probability measures and LPS's under the as-
sumption that the state space is finite, but there are some significant technical differences 
between the way they relate them and the approach taken here. BBD prove representa- 
tion theorems essentially showing that a preference order on lotteries can be represented 
by a standard utility function on lotteries and an LPS iff it can be represented by a stan- 
dard utility function on lotteries and an NPS. Thus, they show that NPS's and LPS's 
are equiexpressive in terms of representing preference orders on lotteries. The difference 
between BBD's result and Theorem 4.5 is essentially a matter of quantification. BBD's 
result can be viewed as showing that, given an LPS, for each utility function on lotteries, 
there is an NPS that generates the same preference order on lotteries for that particular 
utility function. In principle, the NPS might depend on the utility function. More pre-
cisely, for a fixed LPS $\vec{\mu}$, all that follows from their result is that for each utility function
$u$, there is an NPS $\nu$ such that $(\vec{\mu}, u)$ and $(\nu, u)$ generate the same preference order on
lotteries. Theorem 4.5 says that, given $\vec{\mu}$, there is an NPS $\nu$ such that $(\vec{\mu}, u)$ and $(\nu, u)$
generate the same preference on lotteries for all utility functions $u$.

4.2 The infinite case 

An LPS over an infinite state space W may not be equivalent to any finite LPS. However, 
ideas analogous to those used to prove Proposition 4.4 can be used to provide a bound 
on the length of the minimal-length LPS's in an equivalence class. 

Proposition 4.6: Every LPS over $(W, \mathcal{F})$ is equivalent to an LPS over $(W, \mathcal{F})$ of length
at most 

The first step in relating LPS's to NPS's is to show that, just as in the finite case,
for every LPS $(\mu_\beta : \beta < \alpha)$ of length $\alpha$, there is an equivalent NPS $\nu$. The idea will be
to set $\nu = (1 - \sum_{0 < \beta < \alpha} \epsilon^{n_\beta})\mu_0 + \sum_{0 < \beta < \alpha} \epsilon^{n_\beta}\mu_\beta$. In the finite case, we
could take $n_\beta = \beta$. This worked because each $\beta$ was finite, and the field $\mathbb{R}(\epsilon)$ includes
$\epsilon^j$ for each integer $j$. But now, since $\alpha$ may be greater than $\omega$, we cannot just take
$n_\beta = \beta$. To get this idea to work in the infinite setting, I consider a nonstandard model
of the integers, which includes an "integer" corresponding to each of the ordinals less than
$\alpha$. I then construct a field that includes $\epsilon^{n_\beta}$ even for these nonstandard integers $n_\beta$.

A nonstandard model of the integers is a model that contains the integers and satisfies
every property of the integers expressible in first-order logic. It follows easily from the
compactness theorem of first-order logic [Enderton 1972] that, given an ordinal $\alpha$, there
exists a nonstandard model $I_\alpha$ of the integers that includes elements $n_\beta$, $\beta < \alpha$,
such that $n_j = j$ for $j < \omega$ and $n_\beta < n_{\beta'}$ if $\beta < \beta'$. (Note that since $I_\alpha$ satisfies all
the properties of the integers, it follows that if $n_\beta < n_{\beta'}$, then $n_{\beta'} - n_\beta \ge 1$, a fact
that will be useful later.) The compactness theorem says that, given a collection of
formulas, if each finite subset has a model, then so does the whole set. Consider a language
with a function $+$ and constant symbols for each integer, together with constants $n_\beta$,
$\beta < \alpha$. Consider the collection of first-order formulas in this language consisting of all
the formulas true of the integers, together with the formulas $n_i = i$ for $i < \omega$ and
$n_\beta < n_{\beta'}$ for all $\beta < \beta' < \alpha$. Clearly any finite subset of this set has a model, namely,
the integers. Thus, by compactness, so does the full set. Thus, for each ordinal $\alpha$, there
is a model $I_\alpha$ with the required properties.

Given $\alpha$, I now construct a field $\mathbb{R}(I_\alpha)$ that includes $\epsilon^n$ for each "integer" $n \in I_\alpha$.
To explain the construction, it is best to first consider $\mathbb{R}(\epsilon)$ in a little more detail. Since
$\mathbb{R}(\epsilon)$ is a field, once it includes $\epsilon$, it must include $p(\epsilon)$, where $p$ is a polynomial with real
coefficients. To ensure that every nonzero element of $\mathbb{R}(\epsilon)$ has an inverse, we need not
just finite polynomials in $\epsilon$, but infinite polynomials in $\epsilon$. The inverse of a polynomial
in $\epsilon$ can then be computed using standard "formal" division of polynomials. Moreover,
the leading exponent of the polynomial can be negative. Thus, the inverse of $\epsilon^3$ is, not
surprisingly, $\epsilon^{-3}$; the inverse of $1 - \epsilon$ is $1 + \epsilon + \epsilon^2 + \cdots$.
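
The formal division mentioned above can be carried out coefficient by coefficient. The Python sketch below is an illustration under simplifying assumptions that are not in the text: the polynomial is given by its list of coefficients in increasing powers of $\epsilon$, its constant term is nonzero (otherwise one would first factor out the leading power of $\epsilon$), and only finitely many coefficients of the inverse are computed. The name `inverse_series` is hypothetical.

```python
from fractions import Fraction


def inverse_series(coeffs, terms):
    """First `terms` coefficients of 1/p(eps), where
    p(eps) = coeffs[0] + coeffs[1]*eps + ... and coeffs[0] != 0."""
    a0 = Fraction(coeffs[0])
    if a0 == 0:
        raise ValueError("factor out the leading power of eps first")
    inv = [1 / a0]
    for n in range(1, terms):
        # The eps**n coefficient of p * (1/p) must vanish for n >= 1.
        s = sum(Fraction(coeffs[k]) * inv[n - k]
                for k in range(1, min(n, len(coeffs) - 1) + 1))
        inv.append(-s / a0)
    return inv


# 1/(1 - eps) = 1 + eps + eps**2 + ...
print(inverse_series([1, -1], 5) == [1, 1, 1, 1, 1])  # True
```

Each new coefficient depends only on the ones already computed, which is why the procedure needs a lowest-order term to start from.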

The field $\mathbb{R}(I_\alpha)$ also includes polynomials in $\epsilon$, but now the exponents are not just
integers, but elements of $I_\alpha$. Since a field is closed under multiplication, if it contains
$\epsilon^{n_1}$ and $\epsilon^{n_2}$, it must also include their product. Since $I_\alpha$ satisfies all the properties of
the integers, if it includes $n_1$ and $n_2$, it also includes an element $n_1 + n_2$, and we can
take $\epsilon^{n_1} \times \epsilon^{n_2} = \epsilon^{n_1 + n_2}$. Formally, let $\mathbb{R}(I_\alpha)$ be the non-Archimedean model defined as
follows: $\mathbb{R}(I_\alpha)$ consists of all polynomials of the form $\sum_{n \in J} r_n \epsilon^n$, where $r_n$ is a standard
real, $\epsilon$ is an infinitesimal, and $J$ is a well-founded subset of $I_\alpha$. (Recall that a set is
well founded if it has no infinite descending sequence; thus, the set of integers is not
well founded, since $\cdots < -3 < -2 < -1$ is an infinite descending sequence. The reason I
require well foundedness will be clear shortly.) We can identify the standard real $r$ with
the polynomial $r\epsilon^0$.

The polynomials in $\mathbb{R}(I_\alpha)$ can be added and multiplied using the standard rules for
addition and multiplication of polynomials. It is easy to check that the result of adding
or multiplying two polynomials is another polynomial in $\mathbb{R}(I_\alpha)$. In particular, if $p_1$ and
$p_2$ are two polynomials, $N_1$ is the set of exponents of $p_1$, and $N_2$ is the set of exponents
of $p_2$, then the exponents of $p_1 + p_2$ lie in $N_1 \cup N_2$, while the exponents of $p_1 p_2$ lie in the
set $N_3 = \{n_1 + n_2 : n_1 \in N_1, n_2 \in N_2\}$. Both $N_1 \cup N_2$ and $N_3$ are easily seen to be well
founded if $N_1$ and $N_2$ are. Moreover, for each expression $n_1 + n_2 \in N_3$, it follows from the
well-foundedness of $N_1$ and $N_2$ that there are only finitely many pairs $(n, n') \in N_1 \times N_2$
such that $n + n' = n_1 + n_2$, so the coefficient of $\epsilon^{n_1 + n_2}$ in $p_1 p_2$ is well defined. Finally, each
polynomial (other than 0) has an inverse that can be computed using standard "formal"
division of polynomials; I leave the details to the reader. This step is where the well
foundedness comes in. The formal division process cannot be applied to a polynomial
whose exponents are not well founded, such as $\cdots + \epsilon^{-3} + \epsilon^{-2} + \epsilon^{-1}$. An element
of $\mathbb{R}(I_\alpha)$ is positive if its leading coefficient is positive. Define an order $<$ on $\mathbb{R}(I_\alpha)$ by
taking $a < b$ if $b - a$ is positive. With these definitions, $\mathbb{R}(I_\alpha)$ is a non-Archimedean
field.
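Given $(W, \mathcal{F})$, let $\alpha$ be the minimal ordinal whose cardinality is greater than or
equal to the bound on length given by Proposition 4.6. By construction, $I_\alpha$ has elements
$n_\beta$ for all $\beta < \alpha$ such that $n_i = i$ for $i < \omega$ and $n_\beta < n_{\beta'}$ if $\beta < \beta' < \alpha$. I now
define a map $F_{L \to N}$ from $LPS(W, \mathcal{F})/{\approx}$ to $NPS(W, \mathcal{F})/{\approx}$ just as suggested earlier. In
more detail, given an equivalence class $[\vec{\mu}] \in LPS(W, \mathcal{F})/{\approx}$, by Proposition 4.6, there
exists $\vec{\mu}' \in [\vec{\mu}]$ such that $\vec{\mu}'$ has length $\alpha' \le \alpha$. Writing
$\vec{\mu}' = (\mu_\beta : \beta < \alpha')$, let
$\nu = (1 - \sum_{0 < \beta < \alpha'} \epsilon^{n_\beta})\mu_0 + \sum_{0 < \beta < \alpha'} \epsilon^{n_\beta}\mu_\beta$. By definition,
$\sum_{0 < \beta < \alpha'} \epsilon^{n_\beta} \in \mathbb{R}(I_\alpha)$ (the set of exponents is well ordered since the ordinals are
well ordered), hence so is $1 - \sum_{0 < \beta < \alpha'} \epsilon^{n_\beta}$. The elements $\epsilon^{n_\beta}$ for $\beta < \alpha$ are
also all in $\mathbb{R}(I_\alpha)$. It easily follows that $\nu$ is a nonstandard probability measure over the
field $\mathbb{R}(I_\alpha)$. As observed earlier, if $\beta' < \beta$, then $n_\beta - n_{\beta'} \ge 1$, so $\epsilon^{n_\beta}$ is
infinitesimally smaller than $\epsilon^{n_{\beta'}}$. Arguments essentially identical to those of Lemma A.7
in the appendix can be used to show that $\nu \approx \vec{\mu}'$. Define $F_{L \to N}([\vec{\mu}]) = [\nu]$. The
following result is immediate.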

Theorem 4.7: $F_{L \to N}$ is an injection from $LPS(W, \mathcal{F})/{\approx}$ to $NPS(W, \mathcal{F})/{\approx}$ that preserves
equivalence.

What about the converse? Is it the case that for every NPS there is an equivalent
LPS? The technique for finding an equivalent LPS used in the finite case fails. There is no
obvious way to find a well-ordered sequence of standard probability measures $\mu_0, \mu_1, \ldots$
and a sequence of nonnegative nonstandard reals $\epsilon_0, \epsilon_1, \ldots$ such that $st(\epsilon_{i+1}/\epsilon_i) = 0$ and
$\nu = \epsilon_0\mu_0 + \epsilon_1\mu_1 + \cdots$. As the following example shows, this is not an accident. There
exist NPS's that are not equivalent to any LPS.

Example 4.8: As in Example 3.3, let $W = \mathbb{N}$, the natural numbers, let $\mathcal{F}$ consist of
the finite and cofinite subsets of $\mathbb{N}$, and let $\mathcal{F}' = \mathcal{F} - \{\emptyset\}$. Let $\nu^1$ be an NPS with range
$\mathbb{R}(\epsilon)$, where $\nu^1(U) = |U|\epsilon$ if $U$ is finite and $\nu^1(U) = 1 - |\overline{U}|\epsilon$ if $U$ is cofinite (as usual,
$\overline{U}$ denotes the complement of $U$, which in this case is finite). This is clearly an NPS, and
it corresponds to the cps $\mu^1$ of Example 3.3, in the sense that $st(\nu^1(V \mid U)) = \mu^1(V \mid U)$
for all $V \in \mathcal{F}$, $U \in \mathcal{F}'$. Just as in Example 3.3, it can be shown that there is no LPS $\vec{\mu}$
such that $\nu^1 \approx \vec{\mu}$.

To see the potential relevance of this setup, suppose that a natural number is chosen 
at random and, intuitively, all numbers are equally likely to be chosen. An agent may 
place a bet on the number being in a finite or cofinite set. Intuitively, the agent should 
prefer a bet on a set with larger cardinality. More precisely, if $U_1$ and $U_2$ are two sets in
the algebra, the agent should prefer a bet on $U_1$ over a bet on $U_2$ iff (a) $U_1$ and $U_2$ are
both cofinite and the complement of $U_1$ has smaller cardinality than that of $U_2$, (b) $U_1$
is cofinite and $U_2$ is finite, or (c) $U_1$ and $U_2$ are both finite, and $U_1$ has larger cardinality
than $U_2$. These preferences on acts or bets should translate to statements of likelihood.
The NPS captures these preferences directly; they cannot be captured in an LPS. The
cps of Example 3.3 captures (b) directly, and (c) indirectly: when conditioning on any
finite set that contains $U_1 \cup U_2$, the probability of $U_1$ will be higher than that of $U_2$. ∎
4.3 Countably additive nonstandard probability measures

Do things get any better if countable additivity is required? To answer this ques- 
tion, I must first make precise what countable additivity means in the context of non- 
Archimedean fields. To understand the issue here, recall that for the standard real 
numbers, every bounded nondecreasing sequence has a unique least upper bound, which 
can be taken to be its limit. Given a countable sum each of whose terms is nonnega- 
tive, the partial sums form a nondecreasing sequence. If the partial sums are bounded 
(which they are if the terms in the sums represent the probabilities of a pairwise disjoint 
collection of sets), then the limit is well defined. 

None of the above is true in the case of non-Archimedean fields. For a trivial coun-
terexample, consider the sequence $\epsilon, 2\epsilon, 3\epsilon, \ldots$. Clearly this sequence is bounded (by any
positive real number), but it does not have a least upper bound. For a more subtle
example, consider the sequence $1/2, 3/4, 7/8, \ldots$ in the field $\mathbb{R}(\epsilon)$. Should its limit be 1?
While this does not seem to be an unreasonable choice, note that 1 is not the least upper
bound of the sequence. For example, $1 - \epsilon$ is greater than every term in the sequence,
and is less than 1. So are $1 - 3\epsilon$ and $1 - \epsilon^2$. Indeed, this sequence has no least upper
bound in $\mathbb{R}(\epsilon)$.

Despite these concerns, I define limits in $\mathbb{R}(I^*)$ pointwise. That is, a sequence
$a_1, a_2, a_3, \ldots$ in $\mathbb{R}(I^*)$ converges to $b \in \mathbb{R}(I^*)$ if, for every $n \in I^*$, the coefficients of
$\epsilon^n$ in $a_1, a_2, a_3, \ldots$ converge to the coefficient of $\epsilon^n$ in $b$. (Since the coefficients are stan-
dard reals, the notion of convergence for the coefficients is just the standard definition of
convergence in the reals. Of course, if $\epsilon^n$ does not appear explicitly, its coefficient is taken
to be 0.) Note that here and elsewhere I use the letters $a$ and $b$ (possibly with subscripts)
to denote (standard) reals, and $\epsilon$ to denote an infinitesimal. As usual, $\sum_{i=1}^{\infty} a_i$ is taken to
be $b$ if the sequence of partial sums $\sum_{i=1}^{n} a_i$ converges to $b$. Note that, with this notion of
convergence, $1/2, 3/4, 7/8, \ldots$ converges to 1 even though 1 is not the least upper bound
of the sequence.$^6$ I discuss the consequences of this choice further in Section 7.

With this notion of countable sum, it makes perfect sense to consider countably-
additive nonstandard probability measures. If $\mathcal{F}$ is a $\sigma$-algebra and $LPS^c(W, \mathcal{F})$ and
$NPS^c(W, \mathcal{F})$ denote the countably additive LPS's and NPS's on $(W, \mathcal{F})$, respectively,
then Theorem 4.7 can be applied with no change in proof to show the following.

Theorem 4.9: $F_{L \to N}$ is an injection from $LPS^c(W, \mathcal{F})/{\approx}$ to $NPS^c(W, \mathcal{F})/{\approx}$.

However, as the following example shows, even with the requirement of countable 
additivity, there are nonstandard probability measures that are not equivalent to any 
LPS. 

Example 4.10: Let $W = \{w_1, w_2, w_3, \ldots\}$, and let $\mathcal{F} = 2^W$. Choose any nonstandard
$I^*$ and fix an infinitesimal $\epsilon$ in $\mathbb{R}(I^*)$. Define an NPS $(W, \mathcal{F}, \nu)$ with range $\mathbb{R}(I^*)$ by

6 For those used to thinking of convergence in topological terms, what is going on here is that the 
topology corresponding to this notion of convergence is not Hausdorff. 
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taking v{wj) = aj + bj€, where a,j = 1/2- 7 , &2j-i = 1/2-' 1 , and b 2 j = — 1/2- 7 1 , for 
j = 1,2,3, . . .. Thus, the probabilities of wi,W2, ■ ■ ■ are characterized by the sequence 
1/2 + e, 1/4 - e, 1/8 + e/2, 1/16 - e/2, 1/32 + e/4, .... For f/ C W, define u(U) = 
H{j: Wj eu} a j + e J2{j-. Wj €U} bj- It is easy to see that these sums are well-defined. These 
likelihoods correspond to preferences. For example, an agent should prefer a bet that 
gives a payoff of 1 if w 2 occurs and otherwise to a bet that gives a payoff of 4 if 
occurs and otherwise. As I show in the appendix (see Proposition A. 9), there is no 
LPS fl over (W, J 7 ) such that v jl. | 
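The bet comparison in Example 4.10 can be checked mechanically. The following is only a minimal sketch, assuming that a number of the form $a + b\epsilon$ is stored as the pair `(a, b)` and that pairs are compared lexicographically (which agrees with the order on such numbers when $\epsilon$ is a positive infinitesimal). The names `nu`, `scale`, and `ratios` are illustrative and do not come from the paper.

```python
from fractions import Fraction

# A number a + b*eps (eps a positive infinitesimal) is stored as the pair (a, b).
# Python compares tuples lexicographically, which matches the order on a + b*eps.

def nu(j):
    """nu(w_j) = a_j + b_j*eps as in Example 4.10, for j = 1, 2, 3, ..."""
    a = Fraction(1, 2 ** j)
    k = (j + 1) // 2                 # j = 2k - 1 or j = 2k
    b = Fraction(1, 2 ** (k - 1))
    if j % 2 == 0:                   # even indices carry the negative sign
        b = -b
    return (a, b)

def scale(c, x):
    """Multiply the nonstandard number x = (a, b) by the standard real c."""
    return (c * x[0], c * x[1])

bet_on_w2 = scale(1, nu(2))   # expected payoff of: 1 if w_2 occurs, 0 otherwise
bet_on_w4 = scale(4, nu(4))   # expected payoff of: 4 if w_4 occurs, 0 otherwise

# Ratio of the standard part a_j to the size of the infinitesimal coefficient b_j.
ratios = [nu(j)[0] / abs(nu(j)[1]) for j in range(1, 11)]
```

Under this encoding, `bet_on_w2 > bet_on_w4` holds because the two standard parts coincide and the coefficients of $\epsilon$ decide the comparison. The list `ratios` shrinks toward zero, which is the behavior of $a_j/b_j$ in this example.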

Roughly speaking, the reason that $\nu$ is not equivalent to any LPS in Example 4.10
is that the ratio between $a_j$ and $b_j$ in the definition of $\nu$ (i.e., the ratio between the
"standard part" of $\nu(w_j)$ and the "infinitesimal part" of $\nu(w_j)$) goes to zero. This can
be generalized so as to give a condition on nonstandard probability measures that is 
necessary and sufficient to guarantee that they can be represented by an LPS. However, 
the condition is rather technical and I have not found an interesting interpretation of it, 
so I do not pursue it here. 

5 Relating Popper Spaces to NPS's 

Consider the map $F_{N \to P}$ from nonstandard probability spaces to Popper spaces such that
$F_{N \to P}(W,\mathcal{F},\nu) = (W,\mathcal{F},\mathcal{F}',\mu)$, where $\mathcal{F}' = \{U : \nu(U) \neq 0\}$ and
$\mu(V \mid U) = st(\nu(V \mid U))$ for $V \in \mathcal{F}$, $U \in \mathcal{F}'$. I leave it to the reader to check that
$(W,\mathcal{F},\mathcal{F}',\mu)$ is indeed a Popper space. This is arguably the most natural map; for example,
it is easy to check that $F_{N \to P} \circ F_{S \to N} = F_{S \to P}$, where $F_{S \to N}$ is the restriction of
$F_{L \to N}$ to SLPS's. (Note that $F_{L \to N}$ is well-defined on SLPS's, since if $\vec{\mu}$ is an SLPS, by
Proposition 4.3, $[\vec{\mu}] = \{\vec{\mu}\}$.)
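To make the map from NPS's to Popper spaces concrete, here is a minimal sketch for a finite state space. It assumes, purely for illustration, that every nonstandard probability is a polynomial in a fixed infinitesimal $\epsilon$, stored as a tuple of coefficients; the paper's construction is not restricted to this case. The helper names (`padd`, `order`, `st_ratio`, `measure`, `popper_from_nps`) are hypothetical.

```python
from fractions import Fraction
from itertools import chain, combinations

# A nonstandard number is a polynomial in eps, stored as a tuple of
# coefficients (c0, c1, c2, ...) meaning c0 + c1*eps + c2*eps**2 + ...

def padd(p, q):
    """Add two polynomials in eps."""
    n = max(len(p), len(q))
    p = tuple(p) + (Fraction(0),) * (n - len(p))
    q = tuple(q) + (Fraction(0),) * (n - len(q))
    return tuple(x + y for x, y in zip(p, q))

def order(p):
    """Index of the lowest-order nonzero coefficient, or None if p is 0."""
    for k, c in enumerate(p):
        if c != 0:
            return k
    return None

def st_ratio(p, q):
    """Standard part of p/q, assuming q != 0 and p/q is finite."""
    kp, kq = order(p), order(q)
    if kp is None or kp > kq:
        return Fraction(0)
    assert kp == kq, "p/q would be infinite"
    return Fraction(p[kp]) / Fraction(q[kq])

def measure(nu, event):
    """nu(event), obtained by summing the point masses."""
    total = (Fraction(0),)
    for w in event:
        total = padd(total, nu[w])
    return total

def popper_from_nps(nu):
    """Return (F', mu) with F' = {U : nu(U) != 0} and mu(V|U) = st(nu(V|U))."""
    worlds = list(nu)
    events = [frozenset(s) for s in chain.from_iterable(
        combinations(worlds, r) for r in range(len(worlds) + 1))]
    f_prime = [u for u in events if order(measure(nu, u)) is not None]
    mu = {(v, u): st_ratio(measure(nu, v & u), measure(nu, u))
          for u in f_prime for v in events}
    return f_prime, mu

# Illustrative NPS: nu(w1) = 1 - eps, nu(w2) = eps.
nu = {"w1": (1, -1), "w2": (0, 1)}
f_prime, mu = popper_from_nps(nu)
```

In this small instance every nonempty set ends up in `f_prime`, and $\{w_2\}$ gets conditional probability 0 given $W$ but 1 given itself, so the resulting conditional measure can condition on an event whose unconditional standard probability is 0.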

We might hope that $F_{N \to P}$ is a bijection from $NPS(W,\mathcal{F})/\!\approx$ to $Pop(W,\mathcal{F})$. As I
show shortly, it is not. To understand $F_{N \to P}$ better, define an equivalence relation $\simeq$ on
$NPS(W,\mathcal{F})$ (and $NPS^c(W,\mathcal{F})$) by taking $\nu_1 \simeq \nu_2$ if $\{U : \nu_1(U) = 0\} = \{U : \nu_2(U) = 0\}$
and $st(\nu_1(V \mid U)) = st(\nu_2(V \mid U))$ for all $V$, $U$ such that $\nu_1(U) \neq 0$. Thus, $\simeq$ essentially
says that infinitesimal differences between conditional probabilities do not count. Let
$NPS/\!\simeq$ (resp., $NPS^c/\!\simeq$) consist of the $\simeq$ equivalence classes in $NPS$ (resp., $NPS^c$).
Clearly $F_{N \to P}$ is well defined as a map from $NPS/\!\simeq$ to $Pop(W,\mathcal{F})$ and from $NPS^c/\!\simeq$ to
$Pop^c(W,\mathcal{F})$. As the following result shows, $F_{N \to P}$ is actually a bijection from $NPS^c/\!\simeq$
to $Pop^c(W,\mathcal{F})$.

Theorem 5.1: $F_{N \to P}$ is a bijection from $NPS(W,\mathcal{F})/\!\simeq$ to $Pop(W,\mathcal{F})$ and from
$NPS^c(W,\mathcal{F})/\!\simeq$ to $Pop^c(W,\mathcal{F})$.

Proof: It is easy to see that $F_{N \to P}$ is an injection. In the countable case, the inverse
map can be defined using earlier results. If $(W,\mathcal{F},\mathcal{F}',\mu) \in Pop^c(W,\mathcal{F})$, by Theorem 3.5,
there is a countably additive SLPS $\vec{\mu}'$ such that $F_{S \to P}((W,\mathcal{F},\vec{\mu}')) = (W,\mathcal{F},\mathcal{F}',\mu)$. By
Theorem 4.7, there is some $(W,\mathcal{F},\nu) \in NPS^c(W,\mathcal{F})$ such that $\nu \approx \vec{\mu}'$. It is not hard
to show that $F_{N \to P}(W,\mathcal{F},\nu) = (W,\mathcal{F},\mathcal{F}',\mu)$; see the appendix for details. Showing that
$F_{N \to P}$ is a surjection in the finitely additive case requires more work; again, see the
appendix for details. ∎

McGee [1994] proves essentially the same result as Theorem 5.1 in the case that $\mathcal{F}$ is
an algebra (and the measures involved are not necessarily countably additive). McGee
[1994, p. 181] says that his result shows that "these two approaches amount to the same
thing". However, this is far from clear. The $\simeq$ relation is rather coarse. In particular, it
is coarser than $\approx$.

Proposition 5.2: If $\nu_1 \approx \nu_2$ then $\nu_1 \simeq \nu_2$.

The converse of Proposition 5.2 does not hold in general. As a result, the $\simeq$ relation
identifies nonstandard measures that behave quite differently in decision contexts. This
difference already arises in finite spaces, as the following example shows.

Example 5.3: Suppose $W = \{w_1, w_2\}$. Consider the nonstandard probability measure
$\nu_1$ such that $\nu_1(w_1) = 1/2 + \epsilon$ and $\nu_1(w_2) = 1/2 - \epsilon$. (This is equivalent to the LPS
$(\mu_1, \mu_2)$ where $\mu_1(w_1) = \mu_1(w_2) = 1/2$, $\mu_2(w_1) = 1$, and $\mu_2(w_2) = 0$.) Let $\nu_2$ be the
nonstandard probability measure such that $\nu_2(w_1) = \nu_2(w_2) = 1/2$. Clearly $\nu_1 \simeq \nu_2$.
However, it is not the case that $\nu_1 \approx \nu_2$. Consider the two random variables $\chi_{\{w_1\}}$ and
$\chi_{\{w_2\}}$. (I use the notation $\chi_U$ to denote the indicator function for $U$; that is, $\chi_U(w) = 1$
if $w \in U$ and $\chi_U(w) = 0$ otherwise.) According to $\nu_1$, the expected value of $\chi_{\{w_1\}}$ is (very
slightly) higher than that of $\chi_{\{w_2\}}$. According to $\nu_2$, $\chi_{\{w_1\}}$ and $\chi_{\{w_2\}}$ have the same
expected value. Thus, $\nu_1 \not\approx \nu_2$. Moreover, it is easy to see that there is no Popper measure
$\mu$ on $\{w_1, w_2\}$ that can make the same distinctions with respect to $\chi_{\{w_1\}}$ and $\chi_{\{w_2\}}$ as
$\nu_1$, no matter how we define expected value with respect to a Popper measure. According to
$\nu_1$, although the expected value of $\chi_{\{w_1\}}$ is higher than that of $\chi_{\{w_2\}}$, the expected value
of $\chi_{\{w_1\}}$ is less than that of $\alpha\chi_{\{w_2\}}$ for any (standard) real $\alpha > 1$. There is no Popper
measure with this behavior. ∎
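The comparisons in Example 5.3 can be replayed numerically. This is a minimal sketch, assuming again that $a + b\epsilon$ is encoded as the pair `(a, b)` and compared lexicographically; the helper names `expect`, `st`, and `scaled` are illustrative.

```python
from fractions import Fraction

half = Fraction(1, 2)

# a + b*eps is stored as (a, b); tuples compare lexicographically.
nu1 = {"w1": (half, Fraction(1)), "w2": (half, Fraction(-1))}   # 1/2 + eps, 1/2 - eps
nu2 = {"w1": (half, Fraction(0)), "w2": (half, Fraction(0))}    # 1/2, 1/2

def expect(nu, payoff):
    """Expected value of a real-valued payoff (dict world -> real) under nu."""
    a = sum(payoff[w] * nu[w][0] for w in nu)
    b = sum(payoff[w] * nu[w][1] for w in nu)
    return (a, b)

def st(x):
    """Standard part of a + b*eps."""
    return x[0]

def scaled(alpha, payoff):
    """The random variable alpha * payoff."""
    return {w: alpha * v for w, v in payoff.items()}

chi_w1 = {"w1": 1, "w2": 0}   # indicator of {w1}
chi_w2 = {"w1": 0, "w2": 1}   # indicator of {w2}

# nu1 strictly prefers chi_w1, nu2 is indifferent.
nu1_prefers_w1 = expect(nu1, chi_w1) > expect(nu1, chi_w2)
nu2_indifferent = expect(nu2, chi_w1) == expect(nu2, chi_w2)

# Any standard alpha > 1, however close to 1, reverses nu1's preference.
alpha = Fraction(1001, 1000)
reversed_by_alpha = expect(nu1, chi_w1) < expect(nu1, scaled(alpha, chi_w2))

# The two measures nevertheless agree on all standard parts.
same_standard_parts = all(st(nu1[w]) == st(nu2[w]) for w in nu1)
```

All four flags evaluate to `True` here: the two measures differ in the preferences they induce even though their standard parts coincide, which is the gap between $\approx$ and $\simeq$ that the example illustrates.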

More generally, in finite spaces, Theorem 3.1 shows that Popper spaces are equivalent
to SLPS's, while Theorem 4.5 shows that $LPS(W,\mathcal{F})/\!\approx$ is equivalent to $NPS(W,\mathcal{F})/\!\approx$.
By Proposition 4.3, $SLPS(W,\mathcal{F})/\!\approx$ is essentially identical to $SLPS(W,\mathcal{F})$ (all the
equivalence classes in $SLPS(W,\mathcal{F})/\!\approx$ are singletons), so in finite spaces, the gap in
expressive power between Popper spaces and NPS's essentially amounts to the gap between
$SLPS(W,\mathcal{F})$ and $LPS(W,\mathcal{F})/\!\approx$. This gap is nontrivial. For example, there is no SLPS
equivalent to the LPS $(\mu_1, \mu_2)$ that represents the NPS in Example 5.3.

6 Independence 

The notion of independence is fundamental. As I show in this section, the results of
the previous sections shed light on various notions of independence considered in the
literature for LPS's and (variants of) cps's. I first consider independence for events and
then independence for random variables. I then relate my definitions to those of BBD,
Hammond, and Kohlberg and Reny [1997].

Intuitively, event $U$ is independent of $V$ if learning $U$ gives no information about
$V$. Certainly if learning $U$ gives no information about $V$, then if $\mu$ is an arbitrary
probability measure, we would expect that $\mu(V \mid U) = \mu(V)$. Indeed, this is often taken
as the definition of $V$ being independent of $U$ with respect to $\mu$. If standard probability
measures are used, conditioning is not defined if $\mu(U) = 0$. In this case, $U$ is still
considered independent of $V$. As is well known, if $U$ is independent of $V$, then
$\mu(U \cap V) = \mu(V) \times \mu(U)$ and $V$ is independent of $U$, that is, $\mu(U \mid V) = \mu(U)$. Thus, independence
of events with respect to a probability measure can be defined in any of three equivalent 
ways. Unfortunately, these definitions are not equivalent for other representations of 
uncertainty (see [Halpern 2003, Chapter 4] for a general discussion of this issue). 

The situation is perhaps simplest for nonstandard probability measures. 7 In this 
case, the three notions coincide, for exactly the same reasons as they do for standard 
probability measures. However, independence is perhaps too strong a notion in some 
ways. In particular, nonstandard measures that are equivalent do not in general agree 
on independence, as the following example shows. 

Example 6.1: Suppose that $W = \{w_1, w_2, w_3, w_4\}$. Let $\nu_i(w_1) = 1 - 2\epsilon + \epsilon_i$,
$\nu_i(w_2) = \nu_i(w_3) = \epsilon - \epsilon_i$, and $\nu_i(w_4) = \epsilon_i$ for $i = 1, 2$, where $\epsilon_1 = \epsilon^2$ and
$\epsilon_2 = \epsilon^3$. If $U = \{w_2, w_4\}$ and $V = \{w_3, w_4\}$, then $\nu_i(U) = \nu_i(V) = \epsilon$ and
$\nu_i(U \cap V) = \epsilon_i$. It follows that $U$ and $V$ are independent with respect to $\nu_1$, but not with
respect to $\nu_2$. However, it is easy to check that $\nu_1 \approx \nu_2$. ∎
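The independence claims in Example 6.1 only involve sums and products of polynomials in $\epsilon$, so they can be verified exactly. The sketch below assumes a coefficient-tuple encoding of such polynomials; the helpers `norm`, `padd`, `pmul`, `make_nu`, `measure`, and `independent` are illustrative names, not notation from the paper.

```python
# Polynomials in eps are stored as coefficient tuples (c0, c1, c2, ...),
# with trailing zeros stripped so that equal polynomials compare equal.

def norm(p):
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return tuple(p)

def padd(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return norm(x + y for x, y in zip(p, q))

def pmul(p, q):
    out = [0] * (len(p) + len(q))
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return norm(out)

def make_nu(eps_i):
    """nu_i from Example 6.1; eps_i is eps**2 or eps**3 given as a polynomial."""
    eps = (0, 1)
    neg = tuple(-c for c in eps_i)
    return {
        "w1": padd((1, -2), eps_i),   # 1 - 2*eps + eps_i
        "w2": padd(eps, neg),         # eps - eps_i
        "w3": padd(eps, neg),         # eps - eps_i
        "w4": norm(eps_i),            # eps_i
    }

def measure(nu, event):
    total = ()
    for w in event:
        total = padd(total, nu[w])
    return total

def independent(nu, u, v):
    """Exact independence: nu(U & V) == nu(U) * nu(V)."""
    return measure(nu, u & v) == pmul(measure(nu, u), measure(nu, v))

nu1 = make_nu((0, 0, 1))       # eps_1 = eps**2
nu2 = make_nu((0, 0, 0, 1))    # eps_2 = eps**3
U = frozenset({"w2", "w4"})
V = frozenset({"w3", "w4"})
```

With these definitions `independent(nu1, U, V)` is `True` and `independent(nu2, U, V)` is `False`, since $\nu_i(U)\nu_i(V) = \epsilon^2$ in both cases while $\nu_i(U \cap V)$ is $\epsilon^2$ for $i = 1$ and $\epsilon^3$ for $i = 2$.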

Example 6.1 shows that independence of events in the context of nonstandard measures
is very sensitive to the choice of $\epsilon$, even if this choice does not affect decision
making at all. This suggests the following definition: $U$ is approximately independent of
$V$ with respect to $\nu$ if $\nu(U) \neq 0$ implies that $\nu(V \mid U) - \nu(V)$ is infinitesimal, that is,
if $st(\nu(V \mid U)) = st(\nu(V))$. Note that $U$ can be approximately independent of $V$ without
$V$ being approximately independent of $U$. For example, consider the nonstandard
probability measure $\nu_1$ from Example 6.1. Let $V' = \{w_4\}$; as before, let $U = \{w_2, w_4\}$.
It is easy to check that $st(\nu_1(V' \mid U)) = st(\nu_1(V')) = 0$, but $st(\nu_1(U \mid V')) = 1$, while
$st(\nu_1(U)) = 0$. Thus, $U$ is approximately independent of $V'$ with respect to $\nu_1$, but $V'$
is not approximately independent of $U$. Similarly, $U$ can be approximately independent
of $V$ without $\overline{U}$ being approximately independent of $V$. For example, it is easy to check
that $\overline{V'}$ is approximately independent of $U$ with respect to $\nu_1$, although $V'$ is not.

A straightforward argument shows that $U$ is approximately independent of $V$ with
respect to $\nu$ iff $\nu(U) \neq 0$ implies $st((\nu(V \cap U) - \nu(V) \times \nu(U))/\nu(U)) = 0$, while $V$ is
approximately independent of $U$ with respect to $\nu$ iff the same statement holds with
the roles of $V$ and $U$ reversed. Note for future reference that each of these requirements
is stronger than just requiring that $st(\nu(V \cap U) - \nu(V) \times \nu(U)) = 0$. The latter
requirement is automatically met, for example, if the probability of either $U$ or $V$ is
infinitesimal.

7 Although I talk about $U$ being independent of $V$ with respect to a nonstandard measure $\nu$, technically
I should talk about $U$ being independent of $V$ with respect to an NPS $(W,\mathcal{F},\nu)$, for $U, V \in \mathcal{F}$. I continue
to be sloppy at times, reverting to more careful notation when necessary.

The definition of (approximate) independence extends in a straightforward way to
(approximate) conditional independence. $U$ is conditionally independent of $V$ given $V'$
with respect to a (standard or nonstandard) probability measure $\nu$ if $\nu(U \cap V') \neq 0$ implies
$\nu(V \mid U \cap V') = \nu(V \mid V')$. Again, for probability, $U$ is conditionally independent of $V$
given $V'$ iff $V$ is conditionally independent of $U$ given $V'$ iff $\nu(V \cap U \mid V') = \nu(V \mid V') \times
\nu(U \mid V')$. $U$ is approximately conditionally independent of $V$ given $V'$ with respect to
$\nu$ if $st(\nu(V \mid U \cap V')) = st(\nu(V \mid V'))$. If $V'$ is taken to be $W$, the whole space, then
(approximate) conditional independence reduces to (approximate) independence.

The following proposition shows that, although independence is not preserved by 
equivalence, approximate independence is. 

Proposition 6.2: If $U$ is approximately conditionally independent of $V$ given $V'$ with
respect to $\nu$, and $\nu \approx \nu'$, then $U$ is approximately conditionally independent of $V$ given
$V'$ with respect to $\nu'$.

Proof: Suppose that $\nu \approx \nu'$. I claim that for all events $U_1$ and $U_2$ such that $\nu(U_2) \neq 0$,
$st(\nu(U_1)/\nu(U_2)) = st(\nu'(U_1)/\nu'(U_2))$. For suppose that $st(\nu(U_1)/\nu(U_2)) = \alpha$. Then it
easily follows that $E_\nu(\chi_{U_1}) < E_\nu(\alpha'\chi_{U_2})$ for all $\alpha' > \alpha$, and $E_\nu(\chi_{U_1}) > E_\nu(\alpha''\chi_{U_2})$ for all
$\alpha'' < \alpha$. Thus, the same must be true for $E_{\nu'}$, and hence $st(\nu'(U_1)/\nu'(U_2)) = \alpha$. It thus
follows that $st(\nu(V \mid U \cap V')) = st(\nu'(V \mid U \cap V'))$ and $st(\nu(V \mid V')) = st(\nu'(V \mid V'))$,
from which the result is immediate. ∎

There is an obvious definition of independence for events for Popper spaces: $U$ is
independent of $V$ given $V'$ with respect to the Popper space $(W,\mathcal{F},\mathcal{F}',\mu)$ if $U \cap V' \in \mathcal{F}'$
implies that $\mu(V \mid U \cap V') = \mu(V \mid V')$; if $U \cap V' \notin \mathcal{F}'$, then $U$ is also taken to be
independent of $V$ given $V'$. If $U$ is independent of $V$ given $V'$ and $V' \in \mathcal{F}'$, then
$\mu(U \cap V \mid V') = \mu(U \mid V') \times \mu(V \mid V')$. However, the converse does not necessarily hold.
Nor is it the case that if $U$ is independent of $V$ given $V'$ then $V$ is independent of $U$
given $V'$. A counterexample can be obtained by taking the Popper space arising from
the NPS in Example 6.1. Consider the Popper space $(W, 2^W, \mathcal{F}', \mu)$ corresponding to the
NPS $(W, 2^W, \nu_1)$ via the bijection $F_{N \to P}$. It is easy to check that $U$ is independent of $V'$
but $V'$ is not independent of $U$ with respect to this Popper space, although
$\mu(V' \cap U) = \mu(U \mid V') \times \mu(V')$ $(= 0)$. This observation is an instance of the following more general
result, which is almost immediate from the definitions:

Proposition 6.3: $U$ is approximately independent of $V$ given $V'$ with respect to the
NPS $(W,\mathcal{F},\nu)$ iff $U$ is independent of $V$ given $V'$ with respect to the Popper space
$F_{N \to P}(W,\mathcal{F},\nu)$.

How should independence be defined in LPS's? Interestingly, neither BBD nor Hammond
define independence directly for LPS's. However, they do give definitions in terms
of NPS's that can be applied to equivalent LPS's; indeed, BBD [1991b] do just this (see
the discussion of BBD strong independence below).

I now consider independence for random variables. If $X$ is a random variable on $W$,
let $\mathcal{V}(X)$ denote the range (set of possible values) of random variable $X$; that is,
$\mathcal{V}(X) = \{X(w) : w \in W\}$. Recall that I am assuming that all random variables have countable
range. Random variable $X$ is independent of $Y$ with respect to a standard probability
measure $\mu$ if the event $X = x$ is independent of the event $Y = y$ with respect to $\mu$, for all
$x \in \mathcal{V}(X)$ and $y \in \mathcal{V}(Y)$. By analogy, for nonstandard probability measures, following
Kohlberg and Reny [1997], define $X$ and $Y$ to be weakly independent with respect to $\nu$ if
$X = x$ is approximately independent of $Y = y$ and $Y = y$ is approximately independent
of $X = x$ with respect to $\nu$ for all $x \in \mathcal{V}(X)$ and $y \in \mathcal{V}(Y)$. 8

For standard probability measures, it easily follows that if $X$ is independent of $Y$,
then $X \in U_1$ is independent of $Y \in V_1$ conditional on $Y \in V_2$ and $Y \in V_1$ is independent
of $X \in U_1$ conditional on $X \in U_2$, for all $U_1, U_2 \subseteq \mathcal{V}(X)$ and $V_1, V_2 \subseteq \mathcal{V}(Y)$. The same
arguments show that this is also true for nonstandard probability measures. However,
the argument breaks down for approximate independence.

Example 6.4: Suppose that $W = \{1,2,3\} \times \{1,2\}$. Let $X$ and $Y$ be the random
variables that project onto the first and second components of a world, respectively, so
that $X(i,j) = i$ and $Y(i,j) = j$. Let $\nu$ be the nonstandard probability measure on $W$
given by the following table:

             Y = 1                            Y = 2
   X = 1     $1 - 3\epsilon - 3\epsilon^2$    $\epsilon$
   X = 2     $\epsilon$                       $\epsilon^2$
   X = 3     $\epsilon$                       $2\epsilon^2$

It is easy to check that $X$ and $Y$ are weakly independent with respect to $\nu$; that is, $X = i$
and $Y = j$ are approximately independent of each other for all $i \in \{1,2,3\}$, $j \in \{1,2\}$.
However, $st(\nu(X = 2 \mid X \in \{2,3\} \cap Y = 2)) = 1/3$, while
$st(\nu(X = 2 \mid X \in \{2,3\})) = 1/2$. ∎
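The two claims in Example 6.4 can be confirmed by direct computation. What follows is a sketch under the assumption that each table entry is stored as a coefficient triple for $c_0 + c_1\epsilon + c_2\epsilon^2$; the names `measure`, `st_cond`, `X`, and `Y` are illustrative helpers rather than notation from the paper.

```python
from fractions import Fraction

# nu from Example 6.4; each entry is (c0, c1, c2) for c0 + c1*eps + c2*eps**2.
nu = {
    (1, 1): (1, -3, -3), (1, 2): (0, 1, 0),
    (2, 1): (0, 1, 0),   (2, 2): (0, 0, 1),
    (3, 1): (0, 1, 0),   (3, 2): (0, 0, 2),
}
W = frozenset(nu)

def measure(event):
    """nu(event) as a coefficient triple; event is a set of worlds (i, j)."""
    return tuple(sum(nu[w][k] for w in event) for k in range(3))

def st_cond(a, b):
    """st(nu(a | b)), assuming nu(b) != 0.

    Since 0 <= nu(a & b) <= nu(b), the numerator has no term of lower order
    than the leading term of nu(b), so the standard part of the quotient is
    the ratio of the coefficients at that leading order.
    """
    num, den = measure(a & b), measure(b)
    k = next(i for i, c in enumerate(den) if c != 0)
    return Fraction(num[k], den[k])

def X(values):
    return frozenset(w for w in W if w[0] in values)

def Y(values):
    return frozenset(w for w in W if w[1] in values)

# Weak independence: pairwise approximate independence in both directions.
weakly_independent = all(
    st_cond(Y({y}), X({x})) == st_cond(Y({y}), W)
    and st_cond(X({x}), Y({y})) == st_cond(X({x}), W)
    for x in (1, 2, 3) for y in (1, 2)
)

lhs = st_cond(X({2}), X({2, 3}) & Y({2}))   # st(nu(X = 2 | X in {2,3}, Y = 2))
rhs = st_cond(X({2}), X({2, 3}))            # st(nu(X = 2 | X in {2,3}))
```

Here `weakly_independent` is `True`, while `lhs` and `rhs` come out as 1/3 and 1/2 respectively, matching the values stated in the example.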

In light of this example, I define $X$ to be approximately independent of $\{Y_1, \ldots, Y_n\}$
with respect to $\nu$ if $X \in U_1$ is approximately independent of $(Y_1 \in V_1) \cap \ldots \cap (Y_n \in V_n)$
conditional on $(Y_1 \in V_1') \cap \ldots \cap (Y_n \in V_n')$ with respect to $\nu$ for all $U_1 \subseteq \mathcal{V}(X)$,
$V_i, V_i' \subseteq \mathcal{V}(Y_i)$, and $i = 1, \ldots, n$. $X_1, \ldots, X_n$ are approximately independent with respect
to $\nu$ if $X_i$ is approximately independent of $\{X_1, \ldots, X_n\} - \{X_i\}$ with respect to $\nu$ for
$i = 1, \ldots, n$. I leave to the reader the obvious extensions to conditional independence and the
analogues of this definition for Popper spaces and LPS's.

8 Kohlberg and Reny's definition of weak independence also requires that the joint range of $X$ and $Y$
be the product of the individual ranges. That is, for $X$ and $Y$ to be weakly independent, it must be the
case that for all $x \in \mathcal{V}(X)$ and $y \in \mathcal{V}(Y)$, there exists some $w \in W$ such that $X(w) = x$ and $Y(w) = y$.
Of course, this requirement could also be added to the definition I am proposing here; adding it would
not affect any of the results of this paper.

As I said, BBD consider three notions of independence for random variables. One
is a decision-theoretic notion of stochastic independence on preference relations on acts
over $W$. Under appropriate assumptions, it can be shown that a preference relation is
stochastically independent iff it can be represented by some (real-valued) utility function
$u$ and a nonstandard probability measure $\nu$ such that $X_1, \ldots, X_n$ are approximately
independent with respect to $\nu$ [Battigalli and Veronesi 1996]. A second notion they
consider is a weak notion of product measure that requires only that there exist measures
$\nu_1, \ldots, \nu_n$ such that $st(\nu(w_1, \ldots, w_n)) = st(\nu_1(w_1) \times \cdots \times \nu_n(w_n))$. As we have already
observed, this notion of independence is rather weak. Indeed, an example in BBD shows
that it misses out on some interesting decision-theoretic behavior.

The third notion of independence that BBD consider is the strongest. BBD [1991b]
define $X_1, \ldots, X_n$ to be strongly independent with respect to an LPS $\vec{\mu}$ if they are
independent (in the usual sense) with respect to an NPS $\nu$ such that $\vec{\mu} \approx \nu$. 9 Moreover,
they give a characterization of this notion of strong independence, which I henceforth call
BBD strong independence, to distinguish it from the KR notion of strong independence
that I discuss shortly. Given a vector $\vec{r} = (r_0, \ldots, r_{k-1})$ of reals in $(0,1)^k$ and a
finite LPS $\vec{\mu} = (\mu_0, \ldots, \mu_k)$, let $\vec{\mu} \,\Box\, \vec{r}$ be the (standard) probability measure

$$(1 - r_0)\mu_0 + r_0[(1 - r_1)\mu_1 + r_1[(1 - r_2)\mu_2 + r_2[\cdots + r_{k-2}[(1 - r_{k-1})\mu_{k-1} + r_{k-1}\mu_k] \cdots]]].$$

Note that $\vec{\mu} \,\Box\, \vec{r}$ is defined only if $\vec{\mu}$ is finite. Thus, in discussing BBD strong independence,
I restrict to finite LPS's. In addition, for technical reasons that will become clear in the
proof of Theorem 6.5, I consider only random variables with finite range, which is what
BBD do as well. BBD [1991b, p. 90] claim without proof that "it is straightforward to
show" that $X_1, \ldots, X_n$ are BBD strongly independent with respect to $\vec{\mu}$ iff there is a
sequence $\vec{r}^j$, $j = 1, 2, \ldots$, of vectors in $(0,1)^k$ such that $\vec{r}^j \to (0, \ldots, 0)$ as $j \to \infty$, and
$X_1, \ldots, X_n$ are independent with respect to $\vec{\mu} \,\Box\, \vec{r}^j$ for $j = 1, 2, 3, \ldots$. I can prove this
result only if the NPS $\nu$ such that $\vec{\mu} \approx \nu$ and $X_1, \ldots, X_n$ are independent with respect
to $\nu$ has a range that is an elementary extension of the reals (and thus has the same
first-order properties as the reals).
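The nested convex combination that defines the standard measure obtained from a finite LPS and a vector of weights can be computed by folding from the innermost term outward. The following is a minimal sketch, assuming each measure in the LPS is a dict over the same finite set of worlds; the function name `box` and the sample weights are illustrative, not taken from BBD or from this paper.

```python
from fractions import Fraction

def box(lps, r):
    """Nested convex combination of a finite LPS (mu_0, ..., mu_k)
    with weights r = (r_0, ..., r_{k-1}), each r_j in (0, 1).

    Every mu_j is a dict from worlds to probabilities over the same worlds.
    """
    assert len(lps) == len(r) + 1
    mixed = dict(lps[-1])                        # innermost term: mu_k
    # Work outward: (1 - r_j) * mu_j + r_j * (everything nested inside).
    for mu, rj in zip(reversed(lps[:-1]), reversed(r)):
        mixed = {w: (1 - rj) * mu[w] + rj * mixed[w] for w in mu}
    return mixed

# Illustrative two-level LPS on {w1, w2}: uniform first, then all mass on w1.
mu0 = {"w1": Fraction(1, 2), "w2": Fraction(1, 2)}
mu1 = {"w1": Fraction(1), "w2": Fraction(0)}

mixture = box([mu0, mu1], (Fraction(1, 10),))
closer = box([mu0, mu1], (Fraction(1, 1000),))
```

For the sample weight 1/10 the mixture assigns 11/20 to `w1` and 9/20 to `w2`. As the weight shrinks, the mixture approaches `mu0` while still ranking `w1` strictly above `w2`, which is the information carried by the second level of the LPS.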

Theorem 6.5: There exists an NPS $\nu$ whose range is an elementary extension of the
reals such that $\vec{\mu} \approx \nu$ and $X_1, \ldots, X_n$ are independent with respect to $\nu$ iff there exists a
sequence $\vec{r}^j$, $j = 1, 2, \ldots$, of vectors in $(0,1)^k$ such that $\vec{r}^j \to (0, \ldots, 0)$ as $j \to \infty$, and
$X_1, \ldots, X_n$ are independent with respect to $\vec{\mu} \,\Box\, \vec{r}^j$ for $j = 1, 2, 3, \ldots$.

I do not know if this result holds without requiring that $\nu$ be an elementary extension of
the reals.

9 In [Blume, Brandenburger, and Dekel 1991b], BBD say that this definition of strong independence
is given in [Blume, Brandenburger, and Dekel 1991a]. However, the definition appears to be given only
in terms of NPS's in [Blume, Brandenburger, and Dekel 1991a].

Kohlberg and Reny [1997] define a notion of strong independence with respect to what
they call relative probability spaces, which are closely related to Popper spaces of the form
$(W, 2^W, 2^W - \{\emptyset\}, \mu)$, where all subsets of $W$ are measurable and it is possible to condition
on all nonempty sets. Their definition is similar in spirit to the characterization of BBD
strong independence given in Theorem 6.5. For ease of exposition, I recast their definition
in terms of Popper spaces. $X_1, \ldots, X_n$ are KR-strongly independent with respect to the
Popper space $(W,\mathcal{F},\mathcal{F}',\mu)$, where $\mathcal{F}'$ includes all events of the form $X_i = x$ for $x \in \mathcal{V}(X_i)$,
if there exists a sequence of standard probability measures $\mu_1, \mu_2, \ldots$ such that $\mu_j \to \mu$,
and for all $j = 1, 2, 3, \ldots$, $\mu_j(U) > 0$ for $U \in \mathcal{F}'$ and $X_1, \ldots, X_n$ are independent with
respect to $\mu_j$. As Kohlberg and Reny show, KR-strong independence implies approximate
independence 10 and is, in general, strictly stronger.

The following theorem characterizes KR strong independence in terms of NPS's. 

Theorem 6.6: $X_1, \ldots, X_n$ are KR-strongly independent with respect to the Popper space
$(W,\mathcal{F},\mathcal{F}',\mu)$ iff there exists an NPS $(W,\mathcal{F},\nu)$ such that $F_{N \to P}(W,\mathcal{F},\nu) = (W,\mathcal{F},\mathcal{F}',\mu)$
and $X_1, \ldots, X_n$ are independent with respect to $(W,\mathcal{F},\nu)$.

It follows from the proof that we can require the range of v to be a nonelementary 
extension of the reals, but this is not necessary. 

Kohlberg and Reny show that their notions of weak and strong independence can be 
used to characterize Kreps and Wilson's [1982] notion of sequential equilibrium. BBD 
[1991b] use their notion of strong independence in their characterization of perfect
equilibrium and proper equilibrium for games with more than two players. Finally, Battigalli
[1996] uses approximate independence (or, equivalently, independence in cps's)
to characterize sequential equilibrium.

7 Discussion 

As the preceding discussion shows, there is a sense in which NPS's are more general
than both Popper spaces and LPS's. It would be of interest to get a natural characterization
of those NPS's that are equivalent to Popper spaces and LPS's; this remains
an open problem. LPS's are more expressive than Popper measures in finite spaces and
in infinite spaces where we assume countable additivity (in the sense discussed at the
end of Section 5), but without assuming countable additivity, they are incomparable, as
Examples 3.3 and 3.4 show. Since all of these approaches to representing uncertainty
have been used in characterizing solution concepts in extensive-form games and notions
of admissibility, the results here suggest that it is worth considering the extent to which
these results depend on the particular representation used.

10 They actually show only that it implies weak independence, but the same argument shows that it
implies approximate independence.

It is worth stressing here that this notion of equivalence depends on the fact that 
I have been viewing cps's, LPS's, and NPS's as representations of uncertainty. But, 
as Asheim [2006] emphasizes, they can also be viewed as representations of conditional 
preferences. Example 5.3 shows that, even in finite spaces, NPS's and LPS's can express 
preferences that cps's cannot. However, as Asheim and Perea [2005] point out, in finite 
spaces, cps's can also represent conditional preferences that cannot be represented by 
LPS's and NPS's. See [Asheim 2006] for a detailed discussion of the expressive power of 
these representations with respect to conditional preferences. 

Although NPS's are the most expressive of the three approaches I have considered, 
they have some disadvantages. In particular, working with a nonstandard probability 
measure requires defining and working with a non-Archimedean field. LPS's have the 
advantage of using just standard probability measures. Moreover, their lexicographic 
structure may give useful insights. It seems to be worth considering the extent to which 
LPS's can be generalized so as to increase their expressive power. In particular, it may 
be of interest to consider LPS's indexed by partially ordered and not necessarily well- 
founded sets, rather than just LPS's indexed by the ordinals. For example, Branden- 
burger, Friedenberg, and Keisler [2008] characterize n rounds of iterated deletion using 
finite LPS's, for any n. Rather than using a sequence of (finite) LPS's of different lengths 
to characterize (unbounded) iterated deletion, it seems that a result similar in spirit can 
be obtained using a single LPS indexed by the (positive and negative) integers. 

I conclude with a brief discussion of a few other issues raised by this paper. 

• Belief: The connections between LPS's, NPS's, and cps's are relevant to the notion
of belief. There are two standard notions of belief that can be defined in LPS's. Say
that $U$ is a certain belief in LPS $\vec{\mu}$ of length $\alpha$ if $\mu_\beta(U) = 1$ for all $\beta < \alpha$; $U$ is weakly
believed if $\mu_0(U) = 1$. Brandenburger, Friedenberg, and Keisler [2008] defined a
third notion of belief, intermediate between weak and strong belief, and provided an
elegant decision-theoretic justification of it. According to their definition, an agent
assumes $U$ in $\vec{\mu}$ if there is some $\beta < \alpha$ such that (a) $\mu_{\beta'}(U) = 1$ for all $\beta' \le \beta$,
(b) $\mu_{\beta''}(U) = 0$ for all $\beta'' > \beta$, and (c) $U \subseteq \cup_{\beta' \le \beta} Supp(\mu_{\beta'})$, where $Supp(\mu_{\beta'})$
denotes the support of the probability measure $\mu_{\beta'}$. (Condition (c) is unnecessary
if $W$ is finite, given Brandenburger, Friedenberg, and Keisler's assumption that
$W = \cup_{\beta'} Supp(\mu_{\beta'})$.) There are straightforward analogues of certain belief and weak
belief in Popper spaces. $U$ is strongly believed in a Popper space $(W,\mathcal{F},\mathcal{F}',\mu)$ if
$\mu(U \mid V) = 1$ for all $V \in \mathcal{F}'$; $U$ is weakly believed if $\mu(U \mid V) = 1$ for all $V \in \mathcal{F}'$
such that $\mu(V) > 0$. Analogues of this notion of assumption have been considered
elsewhere in the literature. Van Fraassen [1995] independently defined a notion of
belief using Popper spaces; in a finite state space, an event is what van Fraassen
calls a belief core iff it is assumed in the sense of Brandenburger, Friedenberg, and
Keisler. Battigalli and Siniscalchi's [2002] notion of strong belief is also essentially
equivalent. Assumption also corresponds to Stalnaker's [1998] notion of absolutely
robust belief and Asheim and Søvik's [2005] notion of robust belief. Asheim and
Søvik [2005] do a careful comparison of all these notions (and others).

It is easy to define analogues of certain and weak belief in NPS's: $U$ is a certain belief
if $\nu(U) = 1$; $U$ is weakly believed if $st(\nu(U)) = 1$. The results of this paper suggest
that it may also be worth investigating an analogue of assumption in NPS's.

• Nonstandard utility: In this paper, while I have allowed probabilities to be
lexicographically ordered or nonstandard, I have implicitly assumed that utilities are
standard real numbers (since I have restricted to real-valued random variables).
There is a tradition in decision theory going back to Hausner [1954] and continued 
recently in a sequence of papers by Fishburn and Lavalle (see [Fishburn and Lavalle 
1998] and the references therein) and Hammond [1999] of considering nonstandard 
or lexicographically-ordered utilities. I have not considered the relationship be- 
tween these ideas and the ones considered here, but there may be some fruitful 
connections. 

• Countable additivity for NPS's: Countable additivity for standard probability measures
is essentially a continuity condition. The fact that $\sum_{i=1}^{\infty} a_i$ may not be the
least upper bound of the partial sums $\sum_{i=1}^{n} a_i$ in an NPS leads to a certain lack
of continuity in decision-making. For example, let $W = \{w_1, w_2, \ldots\}$. Consider a
nonstandard probability measure $\nu$ such that $\nu(w_1) = 1/3 - \epsilon$, $\nu(w_2) = 1/3 + \epsilon$,
and $\nu(w_{k+2}) = 1/(3 \times 2^k)$, for $k = 1, 2, \ldots$. Let $U_n = \{w_3, \ldots, w_n\}$ and let
$U_\infty = \{w_3, w_4, \ldots\}$. Clearly $\nu(U_n) \to \nu(U_\infty) = 1/3$. However, $\nu(U_n) < \nu(w_1)$
for all $n$. Thus, $E_\nu(\chi_{\{w_1\}}) > E_\nu(\chi_{U_n})$ for all $n \ge 3$ although $E_\nu(\chi_{\{w_1\}}) < E_\nu(\chi_{U_\infty})$.

Not surprisingly, the same situations can be modeled with LPS's. Consider the LPS
$(\mu_1, \mu_2)$, where $\mu_1 = st(\nu)$, $\mu_2(w_1) = 0$, $\mu_2(w_2) = 2/3$, and $\mu_2(w_{k+2}) = 1/(3 \times 2^k)$
for $k = 1, 2, \ldots$. It is easy to see that again $E_{\vec{\mu}}(\chi_{\{w_1\}}) > E_{\vec{\mu}}(\chi_{U_n})$ for all $n \ge 3$
although $E_{\vec{\mu}}(\chi_{\{w_1\}}) < E_{\vec{\mu}}(\chi_{U_\infty})$. (A similar example can be obtained using SLPS's,
by replacing each world $w_i$ by a pair of worlds $w_i', w_i''$, where $w_i'$ is in the support
of $\mu_1$ and $w_i''$ is in the support of $\mu_2$.)

An analogous continuity problem arises even in finite domains. Let $W = \{w_1, w_2, w_3\}$
and consider a sequence of probability measures $\nu_n$ such that $\nu_n(w_1) = 1/3 - 1/n$,
$\nu_n(w_2) = 1/3 - \epsilon$ and $\nu_n(w_3) = 1/3 + 1/n + \epsilon$. Clearly $\nu_n \to \nu$, where $\nu(w_1) = 1/3$,
$\nu(w_2) = 1/3 - \epsilon$, and $\nu(w_3) = 1/3 + \epsilon$. However, $E_{\nu_n}(\chi_{\{w_1\}}) < E_{\nu_n}(\chi_{\{w_2\}})$ for all $n$,
while $E_\nu(\chi_{\{w_1\}}) > E_\nu(\chi_{\{w_2\}})$. Again, the same situation can be modeled using LPS's
(and even SLPS's).

Of course, continuity plays a significant role in standard axiomatizations of SEU, 
and is vital in proving the existence of a Nash equilibrium. None of the uses of 
continuity that I am familiar with have the specific form of this example, but I 
believe it is worth considering further the impact of this lack of continuity. 
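The lack of continuity described for the countably infinite example can be seen in a short computation. This is a sketch only, assuming the pair encoding `(a, b)` for $a + b\epsilon$ with lexicographic comparison; the limit value for $U_\infty$ is entered by hand rather than computed, and the names `nu`, `nu_U`, and `nu_U_inf` are illustrative.

```python
from fractions import Fraction

third = Fraction(1, 3)

def nu(k):
    """nu(w_k) as a pair (a, b) standing for a + b*eps."""
    if k == 1:
        return (third, Fraction(-1))     # 1/3 - eps
    if k == 2:
        return (third, Fraction(1))      # 1/3 + eps
    return (Fraction(1, 3 * 2 ** (k - 2)), Fraction(0))   # 1/(3 * 2^(k-2))

def nu_U(n):
    """nu(U_n) for U_n = {w_3, ..., w_n}, n >= 3."""
    a = sum(nu(k)[0] for k in range(3, n + 1))
    return (a, Fraction(0))

# Limit of nu(U_n) as n grows, i.e. nu(U_infinity) = 1/3 (entered by hand).
nu_U_inf = (third, Fraction(0))

below_for_all_checked_n = all(nu_U(n) < nu(1) for n in range(3, 60))
limit_exceeds = nu(1) < nu_U_inf
```

Every finite `nu_U(n)` checked falls strictly below `nu(1)`, yet the limiting value exceeds it. The betting preference therefore flips only "at infinity", which is the discontinuity discussed above.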
A Appendix: Proofs

In this section, I prove all the results claimed in the main part of the paper. For the 
convenience of the reader, I repeat the statements of the results. 

Theorem 3.1: IfW is finite and (J 7 , J 7 '), then F s ^p is a bijection from SLPS(W / , J 7 , J 7 ') 
to ¥ov(W,F,F). 

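Proof: The first step is to show that $F_{S \to P}$ is an injection. If $\vec{\mu}, \vec{\mu}' \in SLPS(W,\mathcal{F},\mathcal{F}')$ and
$\vec{\mu} \neq \vec{\mu}'$, let $\mu = F_{S \to P}(W,\mathcal{F},\vec{\mu})$, and let $\mu' = F_{S \to P}(W,\mathcal{F},\vec{\mu}')$. Let $i$ be the least index
such that $\mu_i \neq \mu_i'$. There is some set $U$ such that $\mu_i(U) \neq \mu_i'(U)$. Let $U_i$ be the set such that
$\mu_i(U_i) = 1$ and $\mu_j(U_i) = 0$ for $j < i$; since $\vec{\mu}$ is an SLPS, such a set $U_i$ exists. Similarly, let
$U_i'$ be such that $\mu_i'(U_i') = 1$ and $\mu_j'(U_i') = 0$ for $j < i$. Since $\mu_j = \mu_j'$ for all $j < i$, we must
have $\mu_j(U_i \cup U_i') = \mu_j'(U_i \cup U_i') = 0$ for all $j < i$. Clearly $\mu_i(U_i \cup U_i') > 0$, so $U_i \cup U_i' \in \mathcal{F}'$.
Moreover, $\mu(U \mid U_i \cup U_i') = \mu_i(U \mid U_i \cup U_i') = \mu_i(U)$. Similarly, $\mu'(U \mid U_i \cup U_i') = \mu_i'(U)$.
Hence, $\mu \neq \mu'$.

To show that $F_{S \to P}$ is a surjection, given a cps $\mu$, let $\vec{\mu} = (\mu_0, \ldots, \mu_k)$ be the LPS
constructed in the main text. We must show that $F_{S \to P}(\vec{\mu}) = (W,\mathcal{F},\mathcal{F}',\mu)$. Suppose
that $F_{S \to P}(\vec{\mu}) = (W,\mathcal{F},\mathcal{F}'',\mu')$. I first show that $\mathcal{F}' = \mathcal{F}''$. Suppose that $V \in \mathcal{F}''$. Then
$\mu_i(V) > 0$ for some $i$. Thus, $\mu(V \mid U_i) > 0$. Since $U_i \in \mathcal{F}'$, it follows that $V \in \mathcal{F}'$. Thus,
$\mathcal{F}'' \subseteq \mathcal{F}'$.

To show that $\mathcal{F}' \subseteq \mathcal{F}''$, first note that, by construction, $\mu(U_j \mid \overline{U_0 \cup \ldots \cup U_{j-1}}) = 1$.
It easily follows that if $V \subseteq \overline{U_0 \cup \ldots \cup U_{j-1}}$ then

$$\mu(V \mid \overline{U_0 \cup \ldots \cup U_{j-1}}) = \mu(V \cap U_j \mid \overline{U_0 \cup \ldots \cup U_{j-1}}).$$

Thus, by CP3,

$$\mu(V \mid \overline{U_0 \cup \ldots \cup U_{j-1}}) = \mu(V \cap U_j \mid \overline{U_0 \cup \ldots \cup U_{j-1}}) = \mu(V \mid U_j) \times \mu(U_j \mid \overline{U_0 \cup \ldots \cup U_{j-1}}),$$

so

$$\mu(V \mid U_j) = \mu(V \mid \overline{U_0 \cup \ldots \cup U_{j-1}}). \qquad (1)$$

Now suppose that $V \in \mathcal{F}'$. Clearly $V \cap (U_0 \cup \ldots \cup U_k) \neq \emptyset$, for otherwise
$V \subseteq \overline{U_0 \cup \ldots \cup U_k}$, contradicting the fact that $\overline{U_0 \cup \ldots \cup U_k} \notin \mathcal{F}'$. Let $j_V$ be the smallest
index $j$ such that $V \cap U_j \neq \emptyset$. I claim that $\mu(V \mid \overline{U_0 \cup \ldots \cup U_{j_V - 1}}) \neq 0$. For if
$\mu(V \mid \overline{U_0 \cup \ldots \cup U_{j_V - 1}}) = 0$, then $\mu(U_{j_V} - V \mid \overline{U_0 \cup \ldots \cup U_{j_V - 1}}) = 1$, contradicting the
definition of $U_{j_V}$ as the smallest set $U'$ such that $\mu(U' \mid \overline{U_0 \cup \ldots \cup U_{j_V - 1}}) = 1$. Moreover, since
$V \subseteq \overline{U_0 \cup \ldots \cup U_{j_V - 1}}$, it follows from (1) that $\mu(V \mid U_{j_V}) = \mu(V \mid \overline{U_0 \cup \ldots \cup U_{j_V - 1}}) > 0$.
Thus, $\mu_{j_V}(V) > 0$, so $V \in \mathcal{F}''$.

This argument can be extended to show that $\mu(V' \mid V) = \mu'(V' \mid V)$ for all $V' \in \mathcal{F}$.
Since $V \cap U_j = \emptyset$ for $j < j_V$, it follows that $\mu'(V' \mid V) = \mu_{j_V}(V' \mid V)$. By CP3,
$\mu(V' \mid V) \times \mu(V \mid \overline{U_0 \cup \ldots \cup U_{j_V - 1}}) = \mu(V' \cap V \mid \overline{U_0 \cup \ldots \cup U_{j_V - 1}})$. By (1) and the fact
that $\mu(V \mid U_{j_V}) > 0$, it follows that $\mu(V' \mid V) = \mu(V' \cap V \mid U_{j_V})/\mu(V \mid U_{j_V})$, that is, that
$\mu(V' \mid V) = \mu_{j_V}(V' \mid V)$. ∎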

Although Theorem 3.5 was proved by Spohn [1986], I include a proof here as well, to 
make the paper self-contained. 

Theorem 3.5: For all $W$, the map $F_{S \to P}$ is a bijection from $SLPS^c(W,\mathcal{F},\mathcal{F}')$ to
$Pop^c(W,\mathcal{F},\mathcal{F}')$.

Proof: Again, the difficulty comes in showing that $F_{S \to P}$ is onto. As it says in the main
text, given a Popper space $(W,\mathcal{F},\mathcal{F}',\mu)$, the idea is to construct sets $U_0, U_1, \ldots$ and an
LPS $\vec{\mu}$ such that $\mu_\beta(V) = \mu(V \mid U_\beta)$, and show that $F_{S \to P}(W,\mathcal{F},\vec{\mu}) = (W,\mathcal{F},\mathcal{F}',\mu)$. The
construction is somewhat involved.

As a first step, put an order $\preceq$ on sets in $\mathcal{F}'$ by defining $U \preceq V$ if $\mu(U \mid U \cup V) > 0$.
(Essentially, the same order is considered by van Fraassen [1976].)

Lemma A.1: $\le$ is transitive.

Proof: By definition, if $U \le V$ and $V \le V'$, then $\mu(U \mid U \cup V) > 0$ and $\mu(V \mid V \cup V') > 0$. To see that $\mu(U \mid U \cup V') > 0$, note that $\mu(U \mid U \cup V \cup V') + \mu(V \mid U \cup V \cup V') + \mu(V' \mid U \cup V \cup V') \ge 1$, so at least one of $\mu(U \mid U \cup V \cup V')$, $\mu(V \mid U \cup V \cup V')$, or $\mu(V' \mid U \cup V \cup V')$ is positive. I consider each of the cases separately.

Case 1: Suppose that $\mu(U \mid U \cup V \cup V') > 0$. By CP3,

$$\mu(U \mid U \cup V \cup V') = \mu(U \mid U \cup V') \times \mu(U \cup V' \mid U \cup V \cup V').$$

Thus, $\mu(U \mid U \cup V') > 0$, as desired.

Case 2: Suppose that $\mu(V \mid U \cup V \cup V') > 0$. By assumption, $\mu(U \mid U \cup V) > 0$; since $\mu(V \mid U \cup V \cup V') > 0$, it follows that $\mu(U \cup V \mid U \cup V \cup V') > 0$. Thus, by CP3,

$$\mu(U \mid U \cup V \cup V') = \mu(U \mid U \cup V) \times \mu(U \cup V \mid U \cup V \cup V') > 0.$$

Thus, case 2 can be reduced to case 1.

Case 3: Suppose that $\mu(V' \mid U \cup V \cup V') > 0$. By assumption, $\mu(V \mid V \cup V') > 0$; since $\mu(V' \mid U \cup V \cup V') > 0$, it follows that $\mu(V \cup V' \mid U \cup V \cup V') > 0$. Thus, by CP3,

$$\mu(V \mid U \cup V \cup V') = \mu(V \mid V \cup V') \times \mu(V \cup V' \mid U \cup V \cup V') > 0.$$

Thus, case 3 can be reduced to case 2.

This completes the proof, showing that $\le$ is transitive. ∎
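The order of Lemma A.1 can be checked by brute force on a small finite example. The sketch below is illustrative only: the three-state space, the two-level LPS used to generate a Popper measure, and the helper names `cond`, `leq`, and `is_transitive` are all assumptions introduced here, not part of the paper.

```python
from fractions import Fraction
from itertools import combinations

W = (0, 1, 2)

# Hypothetical two-level LPS, used only to generate a Popper measure on W.
LPS = [
    {0: Fraction(1), 1: Fraction(0), 2: Fraction(0)},
    {0: Fraction(0), 1: Fraction(1, 2), 2: Fraction(1, 2)},
]


def prob(mu, event):
    """Probability of a set of states under a single measure."""
    return sum(mu[w] for w in event)


def cond(v, u):
    """mu(V | U): condition the first measure that gives U positive probability."""
    for mu in LPS:
        if prob(mu, u) > 0:
            return prob(mu, v & u) / prob(mu, u)
    raise ValueError("conditioning event is not in F'")


def leq(u, v):
    """U <= V iff mu(U | U union V) > 0."""
    return cond(u, u | v) > 0


# Here F' is taken to be every nonempty subset of W.
EVENTS = [frozenset(c) for n in range(1, len(W) + 1) for c in combinations(W, n)]


def is_transitive():
    """Check U <= V and V <= V' implies U <= V' over all triples of events."""
    return all(
        leq(a, c)
        for a in EVENTS
        for b in EVENTS
        for c in EVENTS
        if leq(a, b) and leq(b, c)
    )
```

In this example `leq(U, V)` holds exactly when `U` receives positive probability at a level no later than `V` does, which is why the relation is a total preorder here.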

Define $U \sim V$ if $U \le V$ and $V \le U$.

Lemma A.2: $\sim$ is an equivalence relation on $\mathcal{F}'$.

Proof: It is immediate from the definition that $\sim$ is reflexive and symmetric; transitivity follows from the transitivity of $\le$. ∎

Renyi [1956] and van Fraassen [1976] also considered the $\sim$ relation in their papers, and the argument that $\le$ is transitive is similar in spirit to Renyi's argument that $\sim$ is transitive. However, the rest of this proof diverges from those of Renyi and van Fraassen.

Let $[U]$ denote the $\sim$-equivalence class of $U$, and let $\mathcal{F}'/\!\sim\; = \{[U] : U \in \mathcal{F}'\}$.

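Lemma A.3: Each equivalence class $[V] \in \mathcal{F}'/\!\sim$ is closed under countable unions.

Proof: Suppose that $V_1, V_2, \ldots \in [V]$. I must show that $\cup_{i=1}^\infty V_i \in [V]$. Clearly $\cup_{i=1}^\infty V_i \le V_j$ for all $j$. Suppose, by way of contradiction, that $V_j \not\le \cup_{i=1}^\infty V_i$ for some $j$. Since $\le$ is transitive and the sets $V_j$ are all equivalent, it follows that $V_j \not\le \cup_{i=1}^\infty V_i$ for all $j$. Thus, $\mu(V_j \mid \cup_{i=1}^\infty V_i) = 0$ for all $j$. But then, by countable additivity,

$$1 = \mu(\cup_{i=1}^\infty V_i \mid \cup_{i=1}^\infty V_i) \le \sum_{j=1}^\infty \mu(V_j \mid \cup_{i=1}^\infty V_i) = 0,$$

a contradiction. Thus, $[V]$ is closed under countable unions. ∎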

Fix an element $V_0 \in [V]$.

Lemma A.4: $\inf\{\mu(V_0 \mid V_0 \cup V') : V' \in [V]\} > 0$.

Proof: Suppose that $\inf\{\mu(V_0 \mid V_0 \cup V') : V' \in [V]\} = 0$. Then there exist sets $V_1, V_2, \ldots \in [V]$ such that $\mu(V_0 \mid V_0 \cup V_n) < 1/n$. Since $[V]$ is closed under countable unions, $\cup_{i=0}^\infty V_i \in [V]$. Since $V_0 \sim \cup_{i=0}^\infty V_i$, it follows that $\mu(V_0 \mid \cup_{i=0}^\infty V_i) > 0$. But, by CP3,

$$\mu(V_0 \mid \cup_{i=0}^\infty V_i) = \mu(V_0 \mid V_0 \cup V_n) \times \mu(V_0 \cup V_n \mid \cup_{i=0}^\infty V_i) \le \mu(V_0 \mid V_0 \cup V_n) < 1/n.$$

Since this is true for all $n > 0$, it follows that $\mu(V_0 \mid \cup_{i=0}^\infty V_i) = 0$, a contradiction. ∎

The next lemma shows that each equivalence class in $\mathcal{F}'/\!\sim$ has a "maximal element".

Lemma A.5: In each equivalence class $[V]$, there is an element $V^* \in [V]$ such that $\mu(V^* \mid V' \cup V^*) = 1$ for all $V' \in [V]$.

Proof: Again, fix an element $V_0 \in [V]$. By Lemma A.4, there exists some $\alpha_V > 0$ such that $\inf\{\mu(V_0 \mid V_0 \cup V') : V' \in [V]\} = \alpha_V$. Thus, there exist sets $V_1, V_2, V_3, \ldots \in [V]$ such that $\mu(V_0 \mid V_0 \cup V_n) < \alpha_V + 1/n$. By Lemma A.3, $V^* = \cup_{i=0}^\infty V_i \in [V]$. By CP3,

$$\mu(V_0 \mid V^*) = \mu(V_0 \mid V_0 \cup V_n) \times \mu(V_0 \cup V_n \mid V^*) \le \mu(V_0 \mid V_0 \cup V_n) < \alpha_V + 1/n.$$

Thus, $\mu(V_0 \mid V^*) \le \alpha_V$. By choice of $\alpha_V$, it follows that $\mu(V_0 \mid V^*) = \alpha_V$. Suppose that $\mu(V^* \mid V' \cup V^*) < 1$ for some $V' \in [V]$. But then, by CP3,

$$\mu(V_0 \mid V' \cup V^*) = \mu(V_0 \mid V^*) \times \mu(V^* \mid V' \cup V^*) < \alpha_V,$$

contradicting the choice of $\alpha_V$. Thus, $\mu(V^* \mid V' \cup V^*) = 1$ for all $V' \in [V]$. ∎

Define a total order on these equivalence classes by taking $[U] \le [V]$ if $U' \le V'$ for some $U' \in [U]$ and $V' \in [V]$. It is easy to check (using the transitivity of $\le$) that if $U' \le V'$ for some $U' \in [U]$ and some $V' \in [V]$, then $U'' \le V''$ for all $U'' \in [U]$ and all $V'' \in [V]$.

Lemma A.6: $<$ is a well-founded relation on $\mathcal{F}'/\!\sim$.

Proof: Note that if $[U] < [V]$, then $\mu(V \mid U \cup V) = 0$. It now follows from countable additivity that $<$ is a well-founded order on these equivalence classes. For suppose that there exists an infinite decreasing sequence $[U_0] > [U_1] > [U_2] > \cdots$. Since $\mathcal{F}$ is a $\sigma$-algebra, $\cup_{i=0}^\infty U_i \in \mathcal{F}$; since $\mathcal{F}'$ is closed under supersets, $\cup_{i=0}^\infty U_i \in \mathcal{F}'$. By CP3,

$$\mu(U_j \mid \cup_{i=0}^\infty U_i) = \mu(U_j \mid U_j \cup U_{j+1}) \times \mu(U_j \cup U_{j+1} \mid \cup_{i=0}^\infty U_i) = 0.$$

Let $V_0 = U_0$ and, for $j > 0$, let $V_j = U_j - (\cup_{i=0}^{j-1} U_i)$. Clearly the $V_j$'s are pairwise disjoint, $\cup_j U_j = \cup_j V_j$, and $\mu(V_j \mid \cup_{i=0}^\infty U_i) \le \mu(U_j \mid \cup_{i=0}^\infty U_i) = 0$. It now follows, using countable additivity, that

$$1 = \mu(\cup_{i=0}^\infty U_i \mid \cup_{i=0}^\infty U_i) = \sum_{j=0}^\infty \mu(V_j \mid \cup_{i=0}^\infty U_i) = 0.$$

This is a contradiction, so the order on the equivalence classes is well-founded. ∎

Because $<$ is well-founded, there is an order-preserving bijection $O$ from $\mathcal{F}'/\!\sim$ to an initial segment of the ordinals (i.e., $[U] < [V]$ iff $O([U]) < O([V])$). Thus, the equivalence classes can be enumerated using all the ordinals less than some ordinal $\alpha$. By Lemma A.5, there are sets $U_\beta$, $\beta < \alpha$, in $\mathcal{F}'$ such that if $O([U]) = \beta$, then $U_\beta \in [U]$ and $\mu(U_\beta \mid U' \cup U_\beta) = 1$ for all $U' \in [U]$. Define an LPS $\vec\mu = (\mu_0, \mu_1, \ldots)$ of length $\alpha$ by taking $\mu_\beta(V) = \mu(V \mid U_\beta)$. The choice of the $U_\beta$'s guarantees that this is actually an SLPS.

It remains to show that $(W, \mathcal{F}, \mathcal{F}', \mu)$ is the result of applying $F_{S\to P}$ to $(W, \mathcal{F}, \vec\mu)$. Suppose that instead $(W, \mathcal{F}, \mathcal{F}'', \mu')$ is the result. The argument that $\mathcal{F}'' \subseteq \mathcal{F}'$ is identical to that in the finite case: If $V \in \mathcal{F}''$, then $\mu_\beta(V) > 0$ for some $\beta$. Thus, $\mu(V \mid U_\beta) > 0$. Since $U_\beta \in \mathcal{F}'$, it follows that $V \in \mathcal{F}'$. Thus, $\mathcal{F}'' \subseteq \mathcal{F}'$.

Now suppose that $V \in \mathcal{F}'$. Thus, $V \sim U_\beta$ for some $\beta < \alpha$. It follows that $\mu(V \mid U_\beta) > 0$, so $V \in \mathcal{F}''$.

Finally, to show that $\mu(U \mid V) = \mu'(U \mid V)$, suppose that $\beta$ is such that $V \sim U_\beta$. It follows that $\mu(V \mid U_{\beta'}) = 0$ for $\beta' < \beta$ and $\mu(V \mid U_\beta) > 0$. Thus, by definition, $\mu'(U \mid V) = \mu_\beta(U \mid V)$. Without loss of generality, assume that $U \subseteq V$ (otherwise replace $U$ by $U \cap V$). Thus, by CP3,

$$\mu(U \mid V) \times \mu(V \mid V \cup U_\beta) = \mu(U \mid V \cup U_\beta). \qquad (2)$$

Suppose $V' \subseteq V$. Clearly

$$\mu(V' \mid V \cup U_\beta) = \mu(V' \cap U_\beta \mid V \cup U_\beta) + \mu(V' \cap \overline{U_\beta} \mid V \cup U_\beta).$$

Now by CP3 and the fact that $\mu(U_\beta \mid V \cup U_\beta) = 1$,

$$\mu(V' \cap U_\beta \mid V \cup U_\beta) = \mu(V' \mid U_\beta) \times \mu(U_\beta \mid V \cup U_\beta) = \mu(V' \mid U_\beta)$$

and

$$\mu(V' \cap \overline{U_\beta} \mid V \cup U_\beta) \le \mu(\overline{U_\beta} \mid V \cup U_\beta) = 0.$$

Thus, $\mu(V' \mid V \cup U_\beta) = \mu(V' \mid U_\beta)$. Applying this observation to both $U$ and $V$ shows that $\mu(V \mid V \cup U_\beta) = \mu(V \mid U_\beta)$ and $\mu(U \mid V \cup U_\beta) = \mu(U \mid U_\beta)$. Plugging this into (2), it follows that

$$\mu(U \mid V) = \mu(U \mid U_\beta)/\mu(V \mid U_\beta) = \mu_\beta(U)/\mu_\beta(V) = \mu_\beta(U \mid V) = \mu'(U \mid V).$$

This completes the proof of the theorem. ∎
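In a finite space the construction in this proof can be carried out mechanically, which may help in following the steps. The sketch below is an illustration under stated assumptions: the four-state space, the hidden SLPS that generates the Popper measure, and the names `cond`, `leq`, and `recover_lps` are hypothetical. The union of each $\sim$-class is taken as its maximal element, which is the finite analogue of Lemmas A.3 and A.5.

```python
from fractions import Fraction
from functools import cmp_to_key
from itertools import combinations

W = (0, 1, 2, 3)

# Hypothetical SLPS with disjoint supports; it only serves to define `cond`.
HIDDEN_SLPS = [
    {0: Fraction(1, 3), 1: Fraction(2, 3)},
    {2: Fraction(1)},
    {3: Fraction(1)},
]


def prob(mu, event):
    return sum(mu.get(w, Fraction(0)) for w in event)


def cond(v, u):
    """The Popper measure mu(V | U) generated by the hidden SLPS."""
    for mu in HIDDEN_SLPS:
        if prob(mu, u) > 0:
            return prob(mu, v & u) / prob(mu, u)
    raise ValueError("conditioning event is not in F'")


EVENTS = [frozenset(c) for n in range(1, len(W) + 1) for c in combinations(W, n)]


def leq(u, v):
    """U <= V iff mu(U | U union V) > 0."""
    return cond(u, u | v) > 0


def compare(class1, class2):
    """Order two distinct equivalence classes by comparing representatives."""
    a, b = class1[0], class2[0]
    if leq(a, b) and not leq(b, a):
        return -1
    if leq(b, a) and not leq(a, b):
        return 1
    return 0


def recover_lps():
    """Rebuild an LPS from `cond` alone, following the proof's construction."""
    classes = []
    for u in EVENTS:  # group events into ~-equivalence classes
        for cls in classes:
            if leq(u, cls[0]) and leq(cls[0], u):
                cls.append(u)
                break
        else:
            classes.append([u])
    classes.sort(key=cmp_to_key(compare))
    lps = []
    for cls in classes:
        u_beta = frozenset().union(*cls)  # maximal element of the class
        lps.append({w: cond(frozenset({w}), u_beta) for w in W})
    return lps
```

Calling `recover_lps()` uses only the conditional probabilities, so agreement with the hidden SLPS illustrates, for this one example, that the map is onto.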

Proposition 3.9: The map $F_{S\to P}$ is a surjection from $SLPS^c(W, \mathcal{F}, \mathcal{F}')$ onto $\mathcal{T}^c(W, \mathcal{F}, \mathcal{F}')$.

Proof: Suppose that $\mu \in \mathcal{T}^c(W, \mathcal{F}, \mathcal{F}')$. I want to construct an SLPS $\vec\mu \in SLPS^c(W, \mathcal{F}, \mathcal{F}')$ such that $F_{S\to P}(\vec\mu) = \mu$. I first label each element of $\mathcal{F}'$ with a natural number. Intuitively, if $U \in \mathcal{F}'$ is labeled $k$, then $k$ will be the least index such that $\mu_k(U) > 0$. The labeling is done by induction on $k$. Each topmost set in the forest (i.e., the root of some tree in the forest) is labeled 0, as are all sets $U'$ such that $\mu(U' \mid U) > 0$, where $U$ is a topmost node. These are all the nodes labeled by 0. Label all the maximal unlabeled sets by 1 (that is, label $U \in \mathcal{F}'$ by 1 if it is not labeled 0, and is not a subset of another unlabeled set); in addition, label a set $U'$ by 1 if $\mu(U' \mid U) > 0$ and $U$ is labeled by 1. Note that every set at depth 0 or 1 in the forest is labeled by either 0 or 1.

Suppose that the labeling process has been completed for labels $0, \ldots, k$ such that the following properties hold, where $label(U)$ denotes the label of the event $U$:

• all sets up to depth $k$ in the forest have been labeled;

• if $label(U) = k'$, $U' \in \mathcal{F}'$, and $\mu(U' \mid U) > 0$, then $label(U') \le label(U)$.

Label all the maximal unlabeled sets with $k + 1$; in addition, if $U'$ is unlabeled and $\mu(U' \mid U) > 0$ for some $U$ such that $label(U) = k + 1$, then assign label $k + 1$ to $U'$. Clearly the two properties above continue to hold. This completes the labeling process.

Let $\mathcal{C}_k$ be the set of maximal sets in $\mathcal{F}'$ labeled $k$. T2 and T3 guarantee that, for all $k$, the sets in $\mathcal{C}_k$ are disjoint. Let $\mu'_k$ be an arbitrary probability on $W$ such that $\mu'_k(U) > 0$ for all $U \in \mathcal{C}_k$ and $\sum_{U \in \mathcal{C}_k} \mu'_k(U) = 1$. Define an LPS $\vec\mu = (\mu_0, \mu_1, \ldots)$ as follows (where the length of $\vec\mu$ is $\omega$ if $\mathcal{C}_k \ne \emptyset$ for all $k$, and is $k + 1$ if $k$ is the largest integer such that $\mathcal{C}_k \ne \emptyset$). For $V \in \mathcal{F}$, let $\mu_k(V) = \sum_{U \in \mathcal{C}_k} \mu(V \mid U)\,\mu'_k(U)$. I now show that $\vec\mu(V \mid U) = \mu(V \mid U)$ for all $V \in \mathcal{F}$ and $U \in \mathcal{F}'$. Suppose that $U \in \mathcal{C}_k$. Then $\mu_j(U) = 0$ for all $j < k$, and $\mu_k(U) > 0$. Thus, $\vec\mu(V \mid U) = \mu_k(V \mid U)$. But it is immediate from the definition that
$\mu_k(V \mid U) = \mu(V \mid U)$. Thus, $F_{S\to P}(\vec\mu) = \mu$. Moreover, if $U \in \mathcal{F}'$ and $label(U) = k$, let $U'$ be the maximal set containing $U$ such that $label(U') = k$. (The labeling guarantees that such a set exists.) Then $\mu_k(U) \ge \mu(U \mid U')\,\mu'_k(U') > 0$. It follows that $\vec\mu(U) > \vec 0$ for all $U \in \mathcal{F}'$. Finally, note that $\vec\mu$ is an SLPS (in fact, an LCPS). If $U_k = \cup\mathcal{C}_k - \cup_{k' > k}(\cup\mathcal{C}_{k'})$, then the sets $U_k$ are disjoint, and $\mu_k(U_k) = 1$. ∎

Proposition 4.2: If $\nu \approx \vec\mu$, then $\nu(U) > 0$ iff $\vec\mu(U) > \vec 0$. Moreover, if $\nu(U) > 0$, then $st(\nu(V \mid U)) = \mu_j(V \mid U)$, where $\mu_j$ is the first probability measure in $\vec\mu$ such that $\mu_j(U) > 0$.

Proof: Recall that for $U \subseteq W$, $\chi_U$ is the indicator function for $U$; that is, $\chi_U(w) = 1$ if $w \in U$ and $\chi_U(w) = 0$ otherwise. Notice that $E_\nu(\chi_U) > E_\nu(\chi_\emptyset)$ iff $\nu(U) > 0$ and $E_{\vec\mu}(\chi_U) > E_{\vec\mu}(\chi_\emptyset)$ iff $\vec\mu(U) > \vec 0$. Since $\nu \approx \vec\mu$, it follows that $\nu(U) > 0$ iff $\vec\mu(U) > \vec 0$. If $\nu(U) > 0$, note that $E_\nu(\chi_{U \cap V} - r\chi_U) > E_\nu(\chi_\emptyset)$ iff $r < st(\nu(V \mid U))$. Similarly, $E_{\vec\mu}(\chi_{U \cap V} - r\chi_U) > E_{\vec\mu}(\chi_\emptyset)$ iff $r < \mu_j(V \mid U)$, where $j$ is the least index such that $\mu_j(U) > 0$. It follows that $st(\nu(V \mid U)) = \mu_j(V \mid U)$. ∎

Proposition 4.3: If $\vec\mu, \vec\mu' \in SLPS(W, \mathcal{F})$, then $\vec\mu \approx \vec\mu'$ iff $\vec\mu = \vec\mu'$.

Proof: Clearly $\vec\mu = \vec\mu'$ implies that $\vec\mu \approx \vec\mu'$. For the converse, suppose that $\vec\mu \approx \vec\mu'$ for $\vec\mu, \vec\mu' \in SLPS(W, \mathcal{F})$. If $\vec\mu \ne \vec\mu'$, let $\alpha$ be the least ordinal such that $\mu_\alpha \ne \mu'_\alpha$, and let $U$ be such that $\mu_\alpha(U) \ne \mu'_\alpha(U)$. Without loss of generality, suppose that $\mu_\alpha(U) > \mu'_\alpha(U)$. Let the sets $U_\beta$ be such that $\mu_\beta(U_\beta) = 1$ and $\mu_\beta(U_\gamma) = 0$ if $\gamma > \beta$; similarly choose the sets $U'_\beta$. Since $\mu_\beta = \mu'_\beta$ for $\beta < \alpha$, it follows that $\mu_\beta(U_\alpha \cup U'_\alpha) = \mu'_\beta(U_\alpha \cup U'_\alpha) = 0$ for $\beta < \alpha$; moreover, $\mu_\alpha(U_\alpha \cup U'_\alpha) = \mu'_\alpha(U_\alpha \cup U'_\alpha) = 1$. Choose $r$ such that $\mu_\alpha(U) > r > \mu'_\alpha(U)$. Let $X$ be the random variable $\chi_U - r\chi_{U_\alpha \cup U'_\alpha}$ and let $Y = \chi_\emptyset$. Then $E_{\vec\mu}(X) > E_{\vec\mu}(Y)$, while $E_{\vec\mu'}(X) < E_{\vec\mu'}(Y)$, so $\vec\mu \not\approx \vec\mu'$. ∎

Proposition 4.4: If $W$ is finite, then every LPS over $(W, \mathcal{F})$ is equivalent to an LPS of length at most $|Basic(\mathcal{F})|$.

Proof: Suppose that $W$ is finite and $Basic(\mathcal{F}) = \{U_1, \ldots, U_k\}$. Given an LPS $\vec\mu$, define a finite subsequence $\vec\mu' = (\mu_{k_0}, \ldots, \mu_{k_h})$ of $\vec\mu$ as follows. Let $\mu_{k_0} = \mu_0$. Suppose that $\mu_{k_0}, \ldots, \mu_{k_j}$ have been defined. If all probability measures in $\vec\mu$ with index greater than $k_j$ are linear combinations of the probability measures $\mu_{k_0}, \ldots, \mu_{k_j}$, then take $\vec\mu' = (\mu_{k_0}, \ldots, \mu_{k_j})$. Otherwise, let $\mu_{k_{j+1}}$ be the probability measure in $\vec\mu$ with least index that is not a linear combination of $\mu_{k_0}, \ldots, \mu_{k_j}$. Since a probability measure over $(W, \mathcal{F})$ is determined by its value on the sets in $Basic(\mathcal{F})$, a probability measure over $(W, \mathcal{F})$ can be identified with a vector in $\mathbb{R}^{|Basic(\mathcal{F})|}$: the vector defining the probabilities of the elements in $Basic(\mathcal{F})$. There can be at most $|Basic(\mathcal{F})|$ linearly independent such vectors; thus, $\vec\mu'$ has length at most $|Basic(\mathcal{F})|$.

It remains to show that $\vec\mu'$ is equivalent to $\vec\mu$. Given random variables $X$ and $Y$, suppose that $E_{\vec\mu}(X) < E_{\vec\mu}(Y)$. Then there is some minimal index $\beta$ such that $E_{\mu_\gamma}(X) = E_{\mu_\gamma}(Y)$ for all $\gamma < \beta$ and $E_{\mu_\beta}(X) < E_{\mu_\beta}(Y)$. It follows that $\mu_\beta$ cannot be a linear combination of $\mu_\gamma$ for $\gamma < \beta$. Thus, $\mu_\beta$ is one of the probability measures in $\vec\mu'$. Moreover, the expected values of $X$ and $Y$ agree for all probability measures in $\vec\mu'$ with lower index (since they do in $\vec\mu$). Thus, $E_{\vec\mu'}(X) < E_{\vec\mu'}(Y)$.

The argument in the other direction is similar in spirit and left to the reader. ∎
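The pruning step in this proof is easy to try out on a small example. The following is a minimal sketch, assuming measures are given as vectors of probabilities on three basic sets; the helper names `rank`, `prune`, and `prefers`, and the example LPS, are illustrative choices made here. Exact rational arithmetic is used so that linear dependence is tested without rounding.

```python
from fractions import Fraction
from itertools import product


def rank(rows):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(a) for a in r] for r in rows]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def prune(lps):
    """Keep each measure that is not a linear combination of those kept so far."""
    kept = []
    for mu in lps:
        if rank(kept + [mu]) > rank(kept):
            kept.append(mu)
    return kept


def expectation(lps, x):
    """Tuple of expected values of x, one per measure in the LPS."""
    return tuple(sum(Fraction(p) * v for p, v in zip(mu, x)) for mu in lps)


def prefers(lps, x, y):
    """True iff E(x) < E(y) in the lexicographic order (tuple comparison)."""
    return expectation(lps, x) < expectation(lps, y)


# Hypothetical LPS of length 5 over three basic sets.
EXAMPLE_LPS = [
    (Fraction(1, 2), Fraction(1, 2), Fraction(0)),
    (Fraction(1, 2), Fraction(1, 2), Fraction(0)),  # duplicate of the first
    (Fraction(1), Fraction(0), Fraction(0)),
    (Fraction(3, 4), Fraction(1, 4), Fraction(0)),  # mix of first and third
    (Fraction(0), Fraction(0), Fraction(1)),
]
```

Here `prune(EXAMPLE_LPS)` keeps the measures at indices 0, 2, and 4, so the pruned LPS has length 3, the number of basic sets. Comparing `prefers` on the full and pruned sequences over a grid of random variables gives a finite spot check of the equivalence, not a proof.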

Theorem 4.5: If $W$ is finite, then $F_{L\to N}$ is a bijection from $LPS(W, \mathcal{F})/\!\approx$ to $NPS(W, \mathcal{F})/\!\approx$ that preserves equivalence (that is, each NPS in $F_{L\to N}([\vec\mu])$ is equivalent to $\vec\mu$).

Proof: I first provide a sufficient condition for an NPS to be equivalent to an LPS in a finite space.

Lemma A.7: Suppose that $\vec\mu = (\mu_0, \ldots, \mu_k)$, and $\epsilon_0, \ldots, \epsilon_k$ are such that $st(\epsilon_{i+1}/\epsilon_i) = 0$ for $i = 1, \ldots, k-1$ and $\sum_{j=0}^k \epsilon_j = 1$. Then $\vec\mu \approx \epsilon_0\mu_0 + \cdots + \epsilon_k\mu_k$.^11

Proof: Suppose that there exist $\epsilon_0, \ldots, \epsilon_k$ as in the statement of the lemma and $\nu = \epsilon_0\mu_0 + \cdots + \epsilon_k\mu_k$. I want to show that $\vec\mu \approx \nu$.

If $E_{\vec\mu}(X) < E_{\vec\mu}(Y)$, then there exists some $j \le k$ such that $E_{\mu_j}(X) < E_{\mu_j}(Y)$ and $E_{\mu_{j'}}(X) = E_{\mu_{j'}}(Y)$ for all $j' < j$. Since $E_\nu(X) = \sum_{i=0}^k \epsilon_i E_{\mu_i}(X)$ and $E_\nu(Y) = \sum_{i=0}^k \epsilon_i E_{\mu_i}(Y)$, to show that $E_\nu(X) < E_\nu(Y)$, it suffices to show that $\epsilon_j(E_{\mu_j}(Y) - E_{\mu_j}(X)) > \sum_{i=j+1}^k \epsilon_i(E_{\mu_i}(X) - E_{\mu_i}(Y))$. Since $\epsilon_{j'+1} \le \epsilon_{j'}$ for $j' \ge j$ (this follows from the fact that $st(\epsilon_{j'+1}/\epsilon_{j'}) = 0$), it follows that $\sum_{i=j+1}^k \epsilon_i(E_{\mu_i}(X) - E_{\mu_i}(Y)) \le \epsilon_{j+1}\sum_{i=j+1}^k |E_{\mu_i}(X) - E_{\mu_i}(Y)|$. Thus, it suffices to show that $\epsilon_{j+1}\sum_{i=j+1}^k |E_{\mu_i}(X) - E_{\mu_i}(Y)| < \epsilon_j(E_{\mu_j}(Y) - E_{\mu_j}(X))$. This is trivially the case if $E_{\mu_i}(X) = E_{\mu_i}(Y)$ for all $i$ such that $j + 1 \le i \le k$. Thus, assume without loss of generality that $\sum_{i=j+1}^k |E_{\mu_i}(X) - E_{\mu_i}(Y)| > 0$. In this case, it suffices to show that $\epsilon_{j+1}/\epsilon_j < (E_{\mu_j}(Y) - E_{\mu_j}(X)) / \sum_{i=j+1}^k |E_{\mu_i}(X) - E_{\mu_i}(Y)|$. Since the right-hand side of the inequality is a positive real and $st(\epsilon_{j+1}/\epsilon_j) = 0$, the result follows.

The argument in the opposite direction is similar. Suppose that $E_\nu(X) < E_\nu(Y)$. Again, since $E_\nu(X) = \sum_{i=0}^k \epsilon_i E_{\mu_i}(X)$ and $E_\nu(Y) = \sum_{i=0}^k \epsilon_i E_{\mu_i}(Y)$, it must be the case that if $j$ is the least index such that $E_{\mu_j}(X) \ne E_{\mu_j}(Y)$, then $E_{\mu_j}(X) < E_{\mu_j}(Y)$. Thus, $E_{\vec\mu}(X) < E_{\vec\mu}(Y)$. It follows that $\vec\mu \approx \nu$. ∎

^11 Although I do not need this fact here, it is easy to see that if $W$ is finite and $\vec\mu = (\mu_0, \ldots, \mu_k)$ is an SLPS in $LPS(W, \mathcal{F})$, then the converse of Lemma A.7 holds as well: if $\nu \approx \vec\mu$, then $\nu = \epsilon_0\mu_0 + \cdots + \epsilon_k\mu_k$ for some $\epsilon_0, \ldots, \epsilon_k$ such that $st(\epsilon_{i+1}/\epsilon_i) = 0$ for $i = 1, \ldots, k-1$ and $\sum_{i=0}^k \epsilon_i = 1$. I conjecture this fact is true in general, not just if $\vec\mu$ is an SLPS, but I have not checked this.
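Lemma A.7 can be illustrated with a toy model of nonstandard numbers. The sketch below assumes that elements of $\mathbb{R}(\epsilon)$ of the form $c_0 + c_1\epsilon + c_2\epsilon^2$ are stored as coefficient tuples, which, for a positive infinitesimal $\epsilon$, are ordered lexicographically. The three-state LPS, the weights $\epsilon_0 = 1 - \epsilon - \epsilon^2$, $\epsilon_1 = \epsilon$, $\epsilon_2 = \epsilon^2$, and all function names are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical LPS (mu_0, mu_1, mu_2) over three states.
LPS = [
    (Fraction(1), Fraction(0), Fraction(0)),
    (Fraction(0), Fraction(1, 2), Fraction(1, 2)),
    (Fraction(0), Fraction(0), Fraction(1)),
]

# Weights as polynomials in eps, stored as coefficients of (1, eps, eps^2).
# They sum to 1 and each is infinitesimal relative to the previous one.
WEIGHTS = [
    (Fraction(1), Fraction(-1), Fraction(-1)),  # eps_0 = 1 - eps - eps^2
    (Fraction(0), Fraction(1), Fraction(0)),    # eps_1 = eps
    (Fraction(0), Fraction(0), Fraction(1)),    # eps_2 = eps^2
]


def expect(mu, x):
    """Standard expected value of the random variable x under mu."""
    return sum(p * v for p, v in zip(mu, x))


def lps_expectation(x):
    """Lexicographic expectation: one real number per level of the LPS."""
    return tuple(expect(mu, x) for mu in LPS)


def nps_expectation(x):
    """E_nu(x) for nu = sum_j eps_j mu_j, as coefficients of (1, eps, eps^2)."""
    coeffs = [Fraction(0)] * 3
    for weight, mu in zip(WEIGHTS, LPS):
        e = expect(mu, x)
        coeffs = [c + w * e for c, w in zip(coeffs, weight)]
    return tuple(coeffs)


def orders_agree(values=(-1, 0, 1, 2)):
    """Spot-check that the LPS and the NPS rank all test variables alike."""
    rvs = list(product(values, repeat=3))
    return all(
        (lps_expectation(x) < lps_expectation(y))
        == (nps_expectation(x) < nps_expectation(y))
        for x in rvs
        for y in rvs
    )
```

Tuple comparison in Python is lexicographic, so it serves both as the LPS order and as the order on these polynomials in $\epsilon$. The check covers only a finite grid of random variables.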

It remains to show that, given an NPS $(W, \mathcal{F}, \nu)$, there is an equivalence class $[\vec\mu]$ such that $F_{L\to N}([\vec\mu]) = [\nu]$. As I said in the main text, the goal now is to find (standard) probability measures $\mu_0, \ldots, \mu_k$ and $\epsilon_0, \ldots, \epsilon_k$ such that $st(\epsilon_{i+1}/\epsilon_i) = 0$ and $\nu = \epsilon_0\mu_0 + \cdots + \epsilon_k\mu_k$. If this can be done then, by Lemma A.7, $\nu \approx (\mu_0, \ldots, \mu_k)$, and we are done.

Suppose that $Basic(\mathcal{F}) = \{U_1, \ldots, U_k\}$ and that $\nu$ has range $\mathbb{R}^*$. Note that a probability measure $\nu'$ on $\mathcal{F}$ can be identified with a vector $(a_1, \ldots, a_k)$ over $\mathbb{R}^*$, where $\nu'(U_i) = a_i$, so that $a_1 + \cdots + a_k = 1$. In the rest of this proof, I frequently identify $\nu$ with such a vector.

Lemma A.8: There exist $k' \le k$, $\epsilon_0, \ldots, \epsilon_{k'}$ where $\epsilon_0 = 1$, $st(\epsilon_{i+1}/\epsilon_i) = 0$ for $i = 1, \ldots, k'-1$, and standard real-valued vectors $\vec b_j$, $j = 0, \ldots, k'$, in $\mathbb{R}^k$ such that

$$\nu = \sum_{j=0}^{k'} \epsilon_j \vec b_j.$$

Proof: I show by induction on $m \le k$ that there exist $\epsilon_0, \ldots, \epsilon_m$ and $m' \le m$ such that $\epsilon_{j'} = 0$ for $j' > m'$, $st(\epsilon_{i+1}/\epsilon_i) = 0$ for $i = 1, \ldots, m'-1$, and standard vectors $\vec b_j$, $j = 0, \ldots, m-1$, and a possibly nonstandard vector $\vec b'_m = (b'_{m1}, \ldots, b'_{mk})$ such that (a) $\nu = \sum_{j=0}^{m-1} \epsilon_j \vec b_j + \epsilon_m \vec b'_m$, (b) $|b'_{mi}| \le 1$, and (c) at least $m$ of $b'_{m1}, \ldots, b'_{mk}$ are standard.

For the base case (where $m = 0$), just take $\vec b'_0 = \nu$ and $\epsilon_0 = 1$. For the inductive step, suppose that $0 \le m < k$. If $\vec b'_m$ is standard, then take $\vec b_m = \vec b'_m$, $\vec b'_{m+1} = \vec 0$, and $\epsilon_{m+1} = 0$. Otherwise, let $\vec b_m = st(\vec b'_m)$ and let $\vec b''_{m+1} = \vec b'_m - \vec b_m$. Let $\epsilon' = \max\{|b''_{(m+1)i}| : i = 1, \ldots, k\}$. Since not all components of $\vec b'_m$ are standard, $\epsilon' > 0$. Note that, by construction, $st(\epsilon'/b_{mi}) = 0$ if $b_{mi} \ne 0$, for $i = 1, \ldots, k$. Let $\vec b'_{m+1} = \vec b''_{m+1}/\epsilon'$ and let $\epsilon_{m+1} = \epsilon'\epsilon_m$. By construction, $|b'_{(m+1)i}| \le 1$ and at least one component of $\vec b'_{m+1}$ is either 1 or $-1$. Moreover, if $b'_{mi}$ is standard, then $b'_{(m+1)i} = b''_{(m+1)i} = 0$. Thus, $\vec b'_{m+1}$ has at least one more standard component than $\vec b'_m$. Since clearly $\nu = \sum_{j=0}^{m} \epsilon_j \vec b_j + \epsilon_{m+1}\vec b'_{m+1}$, this completes the inductive step. The lemma follows immediately. ∎

Returning to the proof of Theorem 4.5, I next prove by induction on $m$ that for all $m \le k'$ (where $k' \le k$ is as in Lemma A.8), there exist standard probability measures $\mu_0, \ldots, \mu_m$, (standard) vectors $\vec b_{m+1}, \ldots, \vec b_{k'} \in \mathbb{R}^k$, and $\epsilon_0, \ldots, \epsilon_{k'}$ such that

$$\nu = \sum_{j=0}^{m} \epsilon_j \mu_j + \sum_{j=m+1}^{k'} \epsilon_j \vec b_j.$$

The base case is immediate from Lemma A.8: taking $\vec b_j$, $j = 1, \ldots, k'$, as in Lemma A.8, $\vec b_0$ is in fact a probability measure since $\vec b_0 = st(\nu)$. Suppose that the result holds for $m$. Consider $\vec b_{m+1}$. If $b_{(m+1)i} < 0$ for some $i$ then, since $\nu(U_i) \ge 0$, there must exist $j' \in \{1, \ldots, m\}$ such that $\mu_{j'}(U_i) > 0$. Thus, there exists some $N > 0$ such that $N\mu_{j'}(U_i) + b_{(m+1)i} > 0$. Since there are only finitely many basic elements and every element in the vector $\mu_j$ is nonnegative, for $j = 0, \ldots, m$, there must exist some $N'$ such that $\vec b'_{m+1} = N'(\mu_0 + \cdots + \mu_m) + \vec b_{m+1} \ge \vec 0$. Let $c = \sum_{i=1}^k b'_{(m+1)i}$, and let $\mu_{m+1} = \vec b'_{m+1}/c$.

Clearly,

$$\nu = (\epsilon_0 - N'\epsilon_{m+1})\mu_0 + \cdots + (\epsilon_m - N'\epsilon_{m+1})\mu_m + c\,\epsilon_{m+1}\mu_{m+1} + \sum_{j=m+2}^{k'} \epsilon_j \vec b_j.$$

This completes the proof of the inductive step.

The theorem now immediately follows. ∎

Proposition 4.6: Every LPS over $(W, \mathcal{F})$ is equivalent to an LPS over $(W, \mathcal{F})$ of length at most $|\mathcal{F}|$.

Proof: The argument is essentially the same as that for Proposition 4.4, using the observation that a probability measure over $(W, \mathcal{F})$ can be identified with an element of $\mathbb{R}^{|\mathcal{F}|}$: the vector defining the probabilities of the elements in $\mathcal{F}$. I leave details to the reader. ∎

Proposition A.9: For the NPS $(W, \mathcal{F}, \nu)$ constructed in Example 4.10, there is no LPS $\vec\mu$ over $(W, \mathcal{F})$ such that $\nu \approx \vec\mu$.

Proof: I start with a straightforward lemma. 

Lemma A.10: Given an LPS $\vec\mu$, there is an LPS $\vec\mu'$ such that $\vec\mu \approx \vec\mu'$ and all the probability measures in $\vec\mu'$ are distinct.

Proof: Define $\vec\mu'$ to be the subsequence consisting of all the distinct probability measures in $\vec\mu$. That is, suppose that $\vec\mu = (\mu_0, \mu_1, \ldots)$. Then $\vec\mu' = (\mu_{k_0}, \mu_{k_1}, \ldots)$, where $k_0 = 0$, and, if $k_\alpha$ has been defined for all $\alpha < \beta$ and there exists an index $\gamma$ such that $\mu_{k_\alpha} \ne \mu_\gamma$ for all $\alpha < \beta$, then $k_\beta$ is the least index $\delta$ such that $\mu_{k_\alpha} \ne \mu_\delta$ for all $\alpha < \beta$. If there is no index $\gamma$ such that $\mu_\gamma \notin \{\mu_{k_\alpha} : \alpha < \beta\}$, then $\vec\mu' = (\mu_{k_\alpha} : \alpha < \beta)$. I leave it to the reader to check that $\vec\mu \approx \vec\mu'$. ∎

Returning to the proof of Proposition A.9, suppose by way of contradiction that $\nu \approx \vec\mu$. Without loss of generality, by Lemma A.10, assume that all the probability measures in $\vec\mu$ are distinct. Clearly $E_\nu(\chi_W) < E_\nu(a\chi_{\{w_1\}})$ if $a \ge 2$ and $E_\nu(\chi_W) > E_\nu(a\chi_{\{w_1\}})$ if $a < 2$. Since $\nu \approx \vec\mu$, it must be the case that $E_{\vec\mu}(\chi_W) < E_{\vec\mu}(a\chi_{\{w_1\}})$ if $a \ge 2$ and $E_{\vec\mu}(\chi_W) > E_{\vec\mu}(a\chi_{\{w_1\}})$ if $a < 2$. Since $E_{\vec\mu}(\chi_W) = (1, 1, \ldots)$, it follows that if $\vec\mu = (\mu_0, \mu_1, \ldots)$, it must be the case that $\mu_0(w_1) = 1/2$ and

$$\mu_1(w_1) \ge 1/2. \qquad (3)$$

Similar arguments (comparing $\chi_W$ to $\chi_{\{w_j\}}$) can be used to show that $\mu_0(w_j) = 1/2^j$ and $\mu_1(w_{2j-1}) \ge 1/2^{2j-1}$ for $j = 1, 2, \ldots$. Next, observe that $E_\nu(\chi_{\{w_1\}} - 2^{2k-1}\chi_{\{w_{2k}\}}) = (2^k + 1)\epsilon$. Thus,

$$E_\nu(\chi_{\{w_1\}} - 2^{2k-1}\chi_{\{w_{2k}\}}) = E_\nu((2^k + 1)(\chi_{\{w_1\}} - (\chi_W/2))).$$

It follows that the same relationship must hold if $\nu$ is replaced by $\vec\mu$. That is,

$$\mu_1(w_1) - 2^{2k-1}\mu_1(w_{2k}) = (2^k + 1)(\mu_1(w_1) - (1/2)).$$

Rearranging terms, this gives

$$2^k\mu_1(w_1) + 2^{2k-1}\mu_1(w_{2k}) = 2^{k-1} + 1/2,$$

or

$$\mu_1(w_1) + 2^{k-1}\mu_1(w_{2k}) = 1/2 + 1/2^{k+1}. \qquad (4)$$

Thus, $\mu_1(w_1) \le 1/2 + 1/2^{k+1}$ for all $k \ge 1$. Putting this together with (3), it follows that $\mu_1(w_1) = 1/2$. Plugging this into (4) gives $\mu_1(w_{2k}) = 1/2^{2k}$. It now follows that $\mu_1 = \mu_0$, contradicting the choice of $\vec\mu$. ∎



Theorem 5.1: $F_{N\to P}$ is a bijection from $NPS(W, \mathcal{F})/\!\simeq$ to $Pop(W, \mathcal{F})$ and from $NPS^c(W, \mathcal{F})/\!\simeq$ to $Pop^c(W, \mathcal{F})$.

Proof: As I said in the main text, the proof that $F_{N\to P}$ is an injection is straightforward, and to prove that it is a surjection in the countably additive case, it suffices to show that $F_{N\to P}(W, \mathcal{F}, \nu) = (W, \mathcal{F}, \mathcal{F}', \mu)$, where $\nu \approx \vec\mu'$ and $\vec\mu'$ is the countably additive SLPS such that $F_{S\to P}((W, \mathcal{F}, \vec\mu')) = (W, \mathcal{F}, \mathcal{F}', \mu)$. I now do this.

Suppose that $F_{N\to P}(W, \mathcal{F}, \nu) = (W, \mathcal{F}, \mathcal{F}'', \mu'')$. First I show that $\nu(U) = 0$ iff $\vec\mu'(U) = \vec 0$. Let $X = \chi_U$ and $Y = \chi_\emptyset$. Note that $\nu(U) = 0$ iff $E_\nu(X) = E_\nu(Y)$ iff $E_{\vec\mu'}(X) = E_{\vec\mu'}(Y)$ iff $\vec\mu'(U) = \vec 0$. Thus, $\mathcal{F}'' = \{U : \nu(U) \ne 0\} = \{U : \vec\mu'(U) \ne \vec 0\} = \mathcal{F}'$.

Now suppose by way of contradiction that $\mu \ne \mu''$. Thus, there must exist some $V \in \mathcal{F}$, $U \in \mathcal{F}'$ such that $\mu(V \mid U) \ne \mu''(V \mid U)$. Let $\beta$ be the smallest ordinal such that $\mu'_\beta(U) \ne 0$. It follows that $\mu'_\beta(V \mid U) \ne st(\nu(V \mid U))$. We can assume without loss of generality that $\mu'_\beta(V \mid U) > st(\nu(V \mid U))$. Choose a real number $r$ such that $\mu'_\beta(V \mid U) > r > st(\nu(V \mid U))$. Then $E_{\vec\mu'}(\chi_{V \cap U}) > E_{\vec\mu'}(r\chi_U)$ but $E_\nu(\chi_{V \cap U}) < E_\nu(r\chi_U)$. This contradicts the assumption that $\vec\mu' \approx \nu$. It follows that $F_{N\to P}(W, \mathcal{F}, \nu) = (W, \mathcal{F}, \mathcal{F}', \mu)$, as desired.

It remains to show that if $(W, \mathcal{F}, \mathcal{F}', \mu) \in Pop(W, \mathcal{F}) - Pop^c(W, \mathcal{F})$, then there is some $(W, \mathcal{F}, \nu) \in NPS(W, \mathcal{F})$ such that $F_{N\to P}(W, \mathcal{F}, \nu) = (W, \mathcal{F}, \mathcal{F}', \mu)$. My proof in this case follows closely the lines of an analogous result proved by McGee [1994]. I provide the details here mainly for completeness.

The proof relies on the following ultrafilter construction of non-Archimedean fields. Given a set $S$, a filter $\mathcal{G}$ on $S$ is a nonempty set of subsets of $S$ that is closed under supersets (so that if $U \in \mathcal{G}$ and $U \subseteq U'$, then $U' \in \mathcal{G}$), is closed under finite intersections (so that if $U_1, U_2 \in \mathcal{G}$, then $U_1 \cap U_2 \in \mathcal{G}$), and does not contain $\emptyset$. An ultrafilter is a maximal filter, that is, a filter that is not a strict subset of any other filter. It is not hard to show that if $\mathcal{U}$ is an ultrafilter on $S$, then for all $U \subseteq S$, either $U \in \mathcal{U}$ or $\overline{U} \in \mathcal{U}$ [Bell and Slomson 1974].

Suppose $\mathbb{F}$ is either $\mathbb{R}$ or a non-Archimedean field, $J$ is an arbitrary set, and $\mathcal{U}$ is an ultrafilter on $J$. Define an equivalence relation $\sim_{\mathcal{U}}$ on $\mathbb{F}^J$ by taking $(a_j : j \in J) \sim_{\mathcal{U}} (b_j : j \in J)$ if $\{j : a_j = b_j\} \in \mathcal{U}$. Similarly, define a total order $\le_{\mathcal{U}}$ by taking $(a_j : j \in J) \le_{\mathcal{U}} (b_j : j \in J)$ if $\{j : a_j \le b_j\} \in \mathcal{U}$. (The fact that $\le_{\mathcal{U}}$ is total uses the fact that for all $U \subseteq J$, either $U \in \mathcal{U}$ or $\overline{U} \in \mathcal{U}$. Note that the pointwise ordering on $\mathbb{F}^J$ is not total.) Let $\mathbb{F}^J/\!\sim_{\mathcal{U}}$ consist of these equivalence classes. Note that $\mathbb{F}$ can be viewed as a subset of $\mathbb{F}^J/\!\sim_{\mathcal{U}}$ by identifying $a \in \mathbb{F}$ with the sequence of all $a$'s.

Define addition and multiplication on $\mathbb{F}^J$ pointwise, so that, for example, $(a_j : j \in J) + (b_j : j \in J) = (a_j + b_j : j \in J)$. It is easy to check that if $(a_j : j \in J) \sim_{\mathcal{U}} (a'_j : j \in J)$, then $(a_j : j \in J) + (b_j : j \in J) \sim_{\mathcal{U}} (a'_j : j \in J) + (b_j : j \in J)$, and similarly for multiplication. Thus, the definitions of $+$ and $\times$ can be extended in the obvious way to $\mathbb{F}^J/\!\sim_{\mathcal{U}}$. With these definitions, it is easy to check that $\mathbb{F}^J/\!\sim_{\mathcal{U}}$ is a field that contains $\mathbb{F}$.

Now given a Popper space $(W, \mathcal{F}, \mathcal{F}', \mu)$ and a finite subset $A = \{U_1, \ldots, U_k\} \subseteq \mathcal{F}$, let $\mathcal{F}_A$ be the (finite) algebra generated by $A$ (that is, the smallest set containing $\{U_1, \ldots, U_k, W\}$ that is closed under unions and complement). Let $\mathcal{F}'_A = \mathcal{F}_A \cap \mathcal{F}'$. It follows from Theorem 3.1 that there is a finite SLPS $\vec\mu_A$ over $(W, \mathcal{F}_A)$ that is mapped to $(W, \mathcal{F}_A, \mathcal{F}'_A, \mu|_{\mathcal{F}_A \times \mathcal{F}'_A})$ by $F_{S\to P}$. (Although Theorem 3.1 is stated for finite state spaces $W$, the proof relies on only the fact that the algebra is finite, so it applies without change here.) It now follows from Theorem 4.5 that, for each $A$, there is a nonstandard probability space $(W, \mathcal{F}_A, \nu_A)$ with range $\mathbb{R}(\epsilon)$ that is equivalent to $\vec\mu_A$. By Proposition 4.2, it follows that, for $U \in \mathcal{F}_A$, $U \notin \mathcal{F}'_A$ iff $\nu_A(U) = 0$. Moreover, $st(\nu_A(V \mid U)) = \mu(V \mid U)$ for $U \in \mathcal{F}'_A$ and $V \in \mathcal{F}_A$.

Let $J$ consist of all finite subsets of $\mathcal{F}$. For a finite subset $A$ of $\mathcal{F}$, let $G_A$ be the element of $2^J$ consisting of all sets in $J$ containing $A$. Let $\mathcal{G} = \{G \subseteq J : G \supseteq G_A \text{ for some finite } A \subseteq \mathcal{F}\}$. It is easy to check that $\mathcal{G}$ is a filter on $J$. It is a standard result that every filter can be extended to an ultrafilter [Bell and Slomson 1974]. Let $\mathcal{U}$ be an ultrafilter containing $\mathcal{G}$. By the construction above, $\mathbb{R}(\epsilon)^J/\!\sim_{\mathcal{U}}$ is a non-Archimedean field.

Define $\nu$ on $(W, \mathcal{F})$ by taking $\nu(U) = (\nu_A(U) : A \in J)$, where $\nu_A(U)$ is taken to be 0 if $U \notin \mathcal{F}_A$. To see that $\nu$ is indeed a nonstandard probability measure with the required properties, note that clearly $\nu(W) = 1$ (where 1 is identified with the sequence of all 1's). Moreover, to see that $\nu(U) + \nu(V) = \nu(U \cup V)$ for disjoint $U$ and $V$, let $A_{U,V}$ be the smallest subalgebra containing $U$ and $V$. Note that if $A \supseteq A_{U,V}$, then $\nu_A(U) + \nu_A(V) = \nu_A(U \cup V)$. Since the set of algebras containing $A_{U,V}$ is an element of the ultrafilter, the result follows. Similar arguments show that $\nu(U) = 0$ iff $U \notin \mathcal{F}'$ and that $st(\nu(V \mid U)) = \mu(V \mid U)$ if $U \in \mathcal{F}'$ and $V \in \mathcal{F}$. Clearly $F_{N\to P}(\nu) = \mu$. ∎

Proposition 5.2: If $\nu_1 \approx \nu_2$ then $\nu_1 \simeq \nu_2$.

Proof: Suppose that $\nu_1 \approx \nu_2$. To show that $\nu_1 \simeq \nu_2$, first suppose that $\nu_1(U) \ne 0$ for some $U \subseteq W$. Then $E_{\nu_1}(\chi_\emptyset) < E_{\nu_1}(\chi_U)$. Since $\nu_1 \approx \nu_2$, it must be the case that $E_{\nu_2}(\chi_\emptyset) < E_{\nu_2}(\chi_U)$. Thus, $\nu_2(U) \ne 0$. A symmetric argument shows that if $\nu_2(U) \ne 0$ then $\nu_1(U) \ne 0$. Next, suppose that $\nu_1(U) \ne 0$ and $\nu_1(V \mid U) = \alpha$. Thus, $E_{\nu_1}(\alpha\chi_U) = E_{\nu_1}(\chi_{U \cap V})$. Since $\nu_1 \approx \nu_2$, it follows that $E_{\nu_2}(\alpha\chi_U) = E_{\nu_2}(\chi_{U \cap V})$, and so $\nu_2(V \mid U) = \alpha$. Thus, $st(\nu_1(V \mid U)) = st(\nu_2(V \mid U))$. Hence, $\nu_1 \simeq \nu_2$, as desired. ∎

Theorem 6.5: There exists an NPS $\nu$ whose range is an elementary extension of the reals such that $\vec\mu \approx \nu$ and $X_1, \ldots, X_n$ are independent with respect to $\nu$ iff there exists a sequence $\vec r^{\,j}$, $j = 1, 2, \ldots$, of vectors in $(0,1)^k$ such that $\vec r^{\,j} \to (0, \ldots, 0)$ as $j \to \infty$, and $X_1, \ldots, X_n$ are independent with respect to $\vec\mu \,\Box\, \vec r^{\,j}$ for $j = 1, 2, 3, \ldots$.

Proof: Suppose that there exists an NPS $\nu$ whose range is an elementary extension of the reals, $\vec\mu \approx \nu$, and $X_1, \ldots, X_n$ are independent with respect to $\nu$. Using arguments similar in spirit to those of BBD [1991b, Proposition 2], it follows that there exist positive infinitesimals $\epsilon_1, \ldots, \epsilon_k$ such that $\vec\mu \,\Box\, (\epsilon_1, \ldots, \epsilon_k) = \nu$. It is not hard to show that there exists a finite set of real-valued polynomials $p_1, \ldots, p_N$ such that $p_i(\epsilon_1, \ldots, \epsilon_k) = 0$ for $i = 1, \ldots, N$ and if $\vec r$ is a vector of positive reals such that $p_i(\vec r) = 0$ for $i = 1, \ldots, N$, then $X_1, \ldots, X_n$ are independent with respect to $\vec\mu \,\Box\, \vec r$. Thus, for all natural numbers $m > 1$, the range of $\nu$ satisfies the first-order property

$$\exists x_1 \ldots \exists x_k\,(p_1(x_1, \ldots, x_k) = 0 \land \ldots \land p_N(x_1, \ldots, x_k) = 0 \land 0 < x_1 < 1/m \land \ldots \land 0 < x_k < 1/m).$$

Since the range of $\nu$ is an elementary extension of the reals, this first-order property holds of the reals as well. Thus, there exists a sequence $\vec r^{\,j}$ of vectors of positive reals converging to $\vec 0$ such that $p_i(\vec r^{\,j}) = 0$ for $i = 1, \ldots, N$.

The converse follows by a straightforward application of compactness in first-order logic [Enderton 1972]. Suppose that there exists a sequence $\vec r^{\,j}$, $j = 1, 2, \ldots$, of vectors in $(0,1)^k$ such that $\vec r^{\,j} \to (0, \ldots, 0)$ as $j \to \infty$, and $X_1, \ldots, X_n$ are independent with respect to $\vec\mu \,\Box\, \vec r^{\,j}$ for $j = 1, 2, 3, \ldots$. We now apply the compactness theorem. As I mentioned in the proof of Proposition 4.6, the compactness theorem says that, given a collection of formulas, if each finite subset has a model, then so does the whole set. Consider a language with the function symbols $+$ and $\times$, the binary relation $<$, a constant symbol $\mathbf{r}$ for each real number $r$, a unary predicate $N$ (representing the natural numbers), and constant symbols $p_U$ for each set $U \in \mathcal{F}$. Intuitively, $p_U$ represents $\nu(U)$. Consider the following (uncountable) collection of formulas:

(a) All first-order formulas in this language true of the reals. (This includes, for example, a formula such as $\forall x \forall y\,(x + y = y + x)$, which says that addition is commutative, as well as formulas such as $2 + 3 = 5$ and $\sqrt{2} \times \sqrt{3} = \sqrt{6}$.)

(b) Formulas $p_U > 0$ for $U \in \mathcal{F}'$ and $p_U = 0$ for $U \in \mathcal{F} - \mathcal{F}'$.

(c) Formulas $p_U + p_V = p_{U \cup V}$ if $U \cap V = \emptyset$.

(d) The formula $p_W = 1$.

(e) Formulas of the form $p_{X_1 = x_1} \times \cdots \times p_{X_n = x_n} = p_{X_1 = x_1 \cap \ldots \cap X_n = x_n}$, for all values $x_i \in \mathcal{V}(X_i)$, $i = 1, \ldots, n$; these formulas say that $X_1, \ldots, X_n$ are independent with respect to $\nu$.

(f) For every pair $Y$, $Y'$ of random variables such that $E_{\vec\mu}(Y) > E_{\vec\mu}(Y')$, a formula that says $E_\nu(Y) > E_\nu(Y')$, where $E_\nu(Y)$ and $E_\nu(Y')$ are expressed using the constant symbols $p_U$ (where the events $U$ are those of the form $Y = y$ and $Y' = y'$). Note that this formula is finite, since $Y$ and $Y'$ are assumed to have finite range. The formula would not be expressible in first-order logic if $Y$ or $Y'$ had infinite range.

It is not hard to show that every finite subset of these formulas is satisfiable. Indeed, given a finite subset of formulas, there must exist some $m$ such that taking $p_U = (\vec\mu \,\Box\, \vec r^{\,m})(U)$ will work (and interpreting $\mathbf{r}$ as the real number $r$, of course). The only nonobvious part is showing that we can deal with the formulas in part (f); that we can do so follows from the proof of Proposition 1 in [Blume, Brandenburger, and Dekel 1991b], which shows that $E_{\vec\mu}(Y') > E_{\vec\mu}(Y)$ iff there exists some $M$ such that $E_{\vec\mu \Box \vec r^{\,m}}(Y') > E_{\vec\mu \Box \vec r^{\,m}}(Y)$ for all $m \ge M$.

Since every finite set of formulas is satisfiable, by compactness, the infinite set is satisfiable. Let $\nu(U)$ be the interpretation of $p_U$ in a model satisfying these formulas. Then it is easy to check that the range of $\nu$ is an elementary extension of the reals, $\nu \approx \vec\mu$, and that $X_1, \ldots, X_n$ are independent with respect to $\nu$. ∎
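The condition in Theorem 6.5 can be explored numerically for two-level LPS's. The sketch below assumes that $\vec\mu \,\Box\, \vec r$ denotes the nested mixture $(1 - r_1)\mu_0 + r_1((1 - r_2)\mu_1 + r_2(\cdots))$ in the style of Blume, Brandenburger, and Dekel; that form, the four-state space of pairs $(x_1, x_2)$, and the names `box`, `independent`, and `point` are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

# States are pairs (x1, x2); X1 and X2 are the coordinate random variables.
STATES = list(product((0, 1), repeat=2))


def box(lps, r):
    """Nested mixture (assumed form): (1-r1)*mu0 + r1*((1-r2)*mu1 + r2*(...))."""
    mix = dict(lps[-1])
    for mu, ri in zip(reversed(lps[:-1]), reversed(r)):
        mix = {w: (1 - ri) * mu[w] + ri * mix[w] for w in STATES}
    return mix


def independent(p):
    """Check P(X1=a, X2=b) = P(X1=a) * P(X2=b) for all values a, b."""
    for a, b in STATES:
        pa = sum(p[w] for w in STATES if w[0] == a)
        pb = sum(p[w] for w in STATES if w[1] == b)
        if p[(a, b)] != pa * pb:
            return False
    return True


def point(w0):
    """Point mass on the state w0."""
    return {w: Fraction(int(w == w0)) for w in STATES}


# In GOOD, X2 is constant under every mixture, so independence always holds.
GOOD = [point((0, 0)), point((1, 0))]
# In BAD, every mixture perfectly correlates X1 and X2.
BAD = [point((0, 0)), point((1, 1))]

# A sequence of mixing weights converging to 0.
RATES = [Fraction(1, 2 ** j) for j in range(1, 6)]
```

For `GOOD`, every `box(GOOD, [r])` with `r` in `RATES` passes `independent`. For `BAD`, none does, even though each individual measure in `BAD` is a point mass and hence trivially a product measure. This shows why the theorem quantifies over the mixtures rather than over the component measures.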

Theorem 6.6: $X_1, \ldots, X_n$ are strongly independent with respect to the Popper space $(W, \mathcal{F}, \mathcal{F}', \mu)$ iff there exists an NPS $(W, \mathcal{F}, \nu)$ such that $F_{N\to P}(W, \mathcal{F}, \nu) = (W, \mathcal{F}, \mathcal{F}', \mu)$ and $X_1, \ldots, X_n$ are independent with respect to $(W, \mathcal{F}, \nu)$.

Proof: It easily follows from Kohlberg and Reny's [1997, Theorem 2.10] characterization of strong independence that if $X_1, \ldots, X_n$ are independent with respect to the NPS $(W, \mathcal{F}, \nu)$ then $X_1, \ldots, X_n$ are strongly independent with respect to $F_{N\to P}(W, \mathcal{F}, \nu)$.

The converse follows using compactness, much as in the proof of Theorem 6.5. Suppose that $(W, \mathcal{F}, \mathcal{F}', \mu)$ is a Popper space and $\mu_j \to \mu$ are as required for $X_1, \ldots, X_n$ to be strongly independent with respect to $\mu$. Consider the same language as in the proof of Theorem 6.5, and essentially the same collection of formulas, except that the formulas of part (f) are replaced by

(f') Formulas of the form $(\mathbf{r} - \frac{1}{n})\,p_V < p_{U \cap V} < (\mathbf{r} + \frac{1}{n})\,p_V$ for all $U$, $V$, $r$, and $n > 0$ such that $\mu(U \mid V) = r$.

Again, it is easy to see that every finite subset of these formulas is satisfiable. Indeed, given a finite subset of formulas, there must exist some $m$ such that taking $p_U = \mu_m(U)$ satisfies all the formulas (and interpreting $\mathbf{r}$ as the real number $r$, of course). By compactness, the infinite set is satisfiable. Let $\nu(U)$ be the interpretation of $p_U$ in a model satisfying these formulas. Then it is easy to check that $F_{N\to P}(W, \mathcal{F}, \nu) = (W, \mathcal{F}, \mathcal{F}', \mu)$, and that $X_1, \ldots, X_n$ are independent with respect to $\nu$. ∎